Posts

Showing posts from November, 2025

Creating Data Visualizations with ChatGPT

For this assignment, I utilized ChatGPT to develop an R visualization using the mtcars dataset. My objective was to create a scatterplot that examines the relationship between vehicle weight and fuel efficiency while distinguishing between cars with different cylinder configurations through color coding and professional styling. The experience proved to be remarkably efficient. ChatGPT produced well-structured ggplot2 code that executed without errors on the first attempt. The AI demonstrated strong design intuition by recommending appropriate aesthetic elements, including a cohesive color scheme, point transparency for overlapping data, and a minimalist theme that enhanced readability without overwhelming the visualization. The primary obstacle I encountered involved dataset selection. My initial attempt required installing additional packages, which presented compatibility issues. However, ChatGPT adapted quickly by suggesting the built-in mtcars dataset as an alternative, which prov...
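The excerpt does not include the generated code, so here is a minimal sketch of the kind of ggplot2 scatterplot it describes: weight against fuel efficiency, colored by cylinder count, with semi-transparent points and a minimal theme. The palette, point size, title, and axis labels are assumptions made for illustration, not the values ChatGPT produced.

```r
library(ggplot2)

# mtcars ships with base R, so no extra data packages are needed.
# Treat cylinder count as a category so each group gets its own color.
cars <- transform(mtcars, cyl = factor(cyl))

p <- ggplot(cars, aes(x = wt, y = mpg, color = cyl)) +
  geom_point(size = 3, alpha = 0.7) +       # transparency helps with overlapping points
  scale_color_brewer(palette = "Dark2") +   # assumed palette, chosen for illustration
  labs(
    title = "Fuel efficiency vs. weight",
    x = "Weight (1,000 lbs)",
    y = "Miles per gallon",
    color = "Cylinders"
  ) +
  theme_minimal()
```

Calling `print(p)` draws the plot, and `ggsave("mtcars_scatter.png", plot = p)` would write it to a file (the file name is hypothetical).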

R Markdown Assignment Reflection

Exploring R Markdown for Reproducible Research

This assignment provided hands-on experience with R Markdown, a dynamic document format that integrates statistical analysis with scholarly writing. Unlike traditional workflows that separate data processing from report generation, R Markdown unifies these steps into a single, reproducible framework.

Learning Outcomes: Working with R Markdown revealed the efficiency of combining prose with executable code. The Markdown syntax facilitated straightforward text formatting, while LaTeX integration enabled precise mathematical expressions. Most notably, embedding R code chunks within the narrative structure allowed for automatic generation and updating of analytical results, a significant improvement over manual data transfer methods.

Implementation Experience: The document compilation process demonstrated R Markdown's capability to execute sequential code chunks and seamlessly integrate outputs. The mtcars dataset analysis, including summa...

Social Network Visualization with Python

For this assignment, I created a social network graph using Python with the NetworkX and Plotnine libraries. The graph shows 10 nodes (labeled A through J) connected by edges representing relationships between them.

What worked well: Once I got everything set up, the code ran smoothly and generated a clean visualization. The Plotnine library made it easy to customize the appearance of the graph, and the spring layout algorithm created a nice spread of the nodes.

What problems did I encounter: The main issue I had was getting the graph to actually display. The code would run without errors, but the visualization window wouldn't show up. I ended up having to save it as a PNG file instead of viewing it directly, which worked perfectly.

Would I use this method again? Honestly, it depends on the project. For quick visualizations, this method works well once you have everything installed. However, the display issues were annoying. If I needed to do more complex networ...

Module 11: Debugging R Functions

For this assignment, I worked on debugging a function that identifies outlier rows in a numeric matrix using the Tukey rule. The original code had a bug that I had to find and fix.

The Original Buggy Code

The function was supposed to flag rows where every column contains an outlier. Here's what I started with:

    tukey.outlier = function(x) {
      Q1 = quantile(x, 0.25, na.rm = TRUE)
      Q3 = quantile(x, 0.75, na.rm = TRUE)
      IQR = Q3 - Q1
      lower = Q1 - 1.5 * IQR
      upper = Q3 + 1.5 * IQR
      return(x < lower | x > upper)
    }

    tukey_multiple = function(x) {
      outliers = array(TRUE, dim = dim(x))
      for (j in 1:ncol(x)) {
        outliers[, j] = outliers[, j] && tukey.outlier(x[, j])
      }
      outlier.vec = vector("logical", length = nrow(x))
      for (i in 1:nrow(x)) {
        outlier.vec[i] = all(outliers[i, ])
      }
      return(outlier.vec)
    }

    Error in outliers[, j] && tukey.outlier(x[, j]) :
      'length ...
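The excerpt cuts off before the fix is shown, so the following is only a sketch of one plausible repair, not necessarily the author's. It assumes the error comes from `&&`, which expects length-one operands, and swaps in the element-wise `&` so that whole columns can be combined. The name `tukey_multiple_fixed` and the small test matrix `m` are hypothetical, introduced for illustration.

```r
# Flag outliers in one numeric vector using Tukey's 1.5 * IQR fences.
tukey.outlier <- function(x) {
  q1 <- quantile(x, 0.25, na.rm = TRUE)
  q3 <- quantile(x, 0.75, na.rm = TRUE)
  iqr <- q3 - q1
  x < (q1 - 1.5 * iqr) | x > (q3 + 1.5 * iqr)
}

# Hypothetical corrected version: TRUE for rows that are outliers in every column.
tukey_multiple_fixed <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x))
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    # `&` works element by element; `&&` only accepts single values.
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  # Collapse each row: TRUE only if every column flagged it.
  apply(outliers, 1, all)
}

# Hypothetical test data: row 6 is extreme in both columns,
# row 1 is extreme only in column b.
m <- cbind(
  a = c(1, 2, 3, 4, 5, 100),
  b = c(-500, 11, 12, 13, 14, 500)
)
flags <- tukey_multiple_fixed(m)
```

With this test matrix, only row 6 is flagged, because row 1 is an outlier in one column but not the other. Using `seq_len(ncol(x))` instead of `1:ncol(x)` is a small defensive choice so the loop is skipped for a matrix with zero columns.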

Module11_Tufte.R

Module 11 Assignment: Tufte Visualizations in R

For this assignment, I explored Edward Tufte's principles of data visualization by recreating three different plots from Dr. Piwek's post on "Tufte in R." Tufte emphasizes the importance of the data-ink ratio, which means removing unnecessary elements and only showing what's essential to understand the data.

I chose to create three visualizations using the mtcars dataset, which shows the relationship between car weight and fuel efficiency:

1. Dot-Dash Plot in ggplot2
This visualization uses "rug marks" along the bottom and left edges to show where individual data points fall. This is a perfect example of Tufte's minimal design - no grid lines or unnecessary decorations, just the data and simple tick marks showing distribution.

2. Dot-Dash Plot in Lattice
I created the same type of plot using R's lattice graphics system. This shows how different R packages can achieve similar results. The rug marks a...
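As a rough illustration of the dot-dash idea described in item 1, the sketch below draws points plus rug marks on the bottom and left edges using only ggplot2, with the grid lines switched off. It is an approximation under my own assumptions (point size, labels, and `theme_minimal()` as the base theme), not the code from Dr. Piwek's post or from this assignment.

```r
library(ggplot2)

dot_dash <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 2) +
  geom_rug(sides = "bl") +                 # rug marks on the bottom ("b") and left ("l")
  labs(x = "Weight (1,000 lbs)", y = "Miles per gallon") +
  theme_minimal() +
  theme(panel.grid = element_blank())      # drop grid lines to raise the data-ink ratio
```

Calling `print(dot_dash)` renders the plot. The rug marks act as a compact view of each variable's distribution along its axis.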