Posts

Showing posts from September, 2025

Assignment 6

Code:

# Task 1: Matrix Addition & Subtraction
A = matrix(c(2, 0, 1, 3), ncol = 2)
B = matrix(c(5, 2, 4, -1), ncol = 2)
cat("Matrix A:\n")
print(A)
cat("\nMatrix B:\n")
print(B)
A_plus_B = A + B
cat("\nA + B:\n")
print(A_plus_B)
A_minus_B = A - B
cat("\nA - B:\n")
print(A_minus_B)

# Task 2: Create a Diagonal Matrix
D = diag(c(4, 1, 2, 3))
cat("\nDiagonal Matrix D:\n")
print(D)

# Task 3: Construct a Custom 5x5 Matrix
first_col = c(3, 2, 2, 2, 2)
remaining_block = rbind(
  c(1, 1, 1, 1),
  diag(3, 4)
)
custom_matrix = cbind(first_col, remaining_block)
cat("\nCustom 5x5 Matrix:\n")
print(custom_matrix)

Output:

> # Task 1: Matrix Addition & Subtraction
> A = matrix(c(2, 0, 1, 3), ncol = 2)
> B = matrix(c(5, 2, 4, -1), ncol = 2)
> cat("Matrix A:\n")
Matrix A:
> print(A)
     [,1] [,2]
[1,]    2    1
[2,]    0    3
> cat("\nMatrix B:\n")
Matrix B:
> print(B)
     [,1] [,2]
[1,]   ...

Visualizing Fuel Efficiency Deviation

I created a diverging bar chart showing how each car's MPG compares to the dataset average of 20.09. Blue bars indicate above-average performers, coral bars show below-average ones. The zero line marks the mean, making deviations immediately visible. This chart reveals clear differences between groups: fuel-efficient compact cars cluster on the right (Toyota Corolla at +13.8 MPG) while heavy luxury vehicles dominate the left (Cadillac Fleetwood at -9.7 MPG). The deviation from the benchmark is the chart's core feature: the sorted arrangement and contrasting colors let you spot outliers instantly, which connects to Yau's point about making interesting data points stand out. The color coding does the heavy lifting here, letting viewers categorize performance without reading every number. The biggest challenge was the overlapping car names along the x-axis; they're hard to read in places. More importantly, the chart shows which cars devia...
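The chart above can be sketched in a few lines of base R using the built-in mtcars dataset; the colors and label sizes here are illustrative choices, not the exact settings of the original chart:

```r
# Deviation of each car's MPG from the dataset mean (~20.09 for mtcars)
deviation <- mtcars$mpg - mean(mtcars$mpg)
names(deviation) <- rownames(mtcars)
deviation <- sort(deviation)  # sorted so outliers sit at the ends

# Blue for above-average performers, coral for below-average ones
bar_colors <- ifelse(deviation >= 0, "steelblue", "coral")

# Diverging bar chart; the horizontal zero line marks the mean
barplot(deviation, col = bar_colors, las = 2, cex.names = 0.6,
        ylab = "MPG deviation from mean")
abline(h = 0)
```

Sorting before plotting is what makes the divergence readable: the biggest positive and negative deviations land at opposite ends of the axis.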

Module 5 - Matrix Operations in R

Matrix Operations in R

I completed Module 5, which involved creating matrices in R and testing inverse and determinant operations.

Code:

A = matrix(1:100, nrow = 10)
B = matrix(1:1000, nrow = 10)
dim(A)
dim(B)
det(A)
solve(A)
det(B)
solve(B)
detA = det(A)
invA = tryCatch(solve(A), error = function(e) e)
detB = tryCatch(det(B), error = function(e) e)
invB = tryCatch(solve(B), error = function(e) e)

Results:

Matrix A dimensions: [1] 10 10 (square)
Matrix B dimensions: [1] 10 100 (not square)
det(A) = 0
solve(A) gave error "Lapack routine dgesv: system is exactly singular: U[6,6] = 0"
det(B) gave error "'x' must be a square matrix"
solve(B) gave error "'a' (10 x 100) must be square"

Explanation: Matrix A is square, so R can attempt det(A) and solve(A), but the determinant equals 0, meaning the matrix is singular and has no inverse. Matrix B is rectangular (not square), so both det(B) and solve(B) fail because only square matrices can have...
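Why is det(A) zero here? Each column of matrix(1:100, nrow = 10) is just the first column shifted by a constant multiple of 10, so the columns are linearly dependent and the matrix has rank 2 rather than the full rank of 10 needed for invertibility. A quick check with base R's qr():

```r
A <- matrix(1:100, nrow = 10)

# Column j equals column 1 plus 10*(j-1), so only 2 columns are independent
qr(A)$rank    # 2, far below the full rank of 10

# Guarding solve() with tryCatch() captures the singularity error gracefully
invA <- tryCatch(solve(A), error = function(e) conditionMessage(e))
invA          # the Lapack error message instead of a stopped script
```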

Performance Data Visualization

I analyzed a dataset showing average position values from 0.17 to 1.00 across 20 time periods. This data demonstrates clear progression over time.

https://datawrapper.dwcdn.net/SxPyS/1/

The ranking chart shows later time periods achieved higher positions, with period T reaching 1.00 and period A starting at 0.17.

https://datawrapper.dwcdn.net/w7mYM/1/

The pie chart reveals most periods performed moderately (grouped as "Other"), while only four periods achieved top performance.

Part-to-Whole Reflection

Part-to-whole charts clearly show proportions and relationships between components. However, they make it difficult to compare small segments precisely and can hide individual details when data gets grouped into "Other" categories.

Design Choice: I selected a column chart for ranking because it clearly orders performance from highest to lowest, making comparisons easy. For part-to-whole, I chose a pie chart because it shows how individual time periods contribute propo...
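The "Other" grouping can be reproduced in a few lines of R. This is a sketch on stand-in values (the real data lives in the Datawrapper links above): keep the top four periods and collapse the rest into one slice.

```r
# Stand-in average-position values for 20 periods labeled A..T, spanning
# 0.17 to 1.00 as in the post; the real values may differ
periods <- setNames(seq(0.17, 1.00, length.out = 20), LETTERS[1:20])

# Keep the top four performers; collapse everything else into "Other"
top4   <- sort(periods, decreasing = TRUE)[1:4]
other  <- sum(periods) - sum(top4)
slices <- c(top4, Other = other)

pie(slices, main = "Top periods vs. the rest")
```

The trade-off noted above is visible in the code itself: everything folded into `other` loses its identity, which is exactly why small segments become hard to compare.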

US Transit Ridership Analysis By Omar Hamad

My visualization: https://public.tableau.com/shared/HPQ2J238T?:display_count=n&:origin=viz_share_link

Selected Variables

From 12 original variables, I chose 6:
Year - for time series
Primary USA City - geographic comparison
Ridership - main metric
Primary UZA Sq Miles - city size context
Vehicle Revenue Hours - service levels
Collisions with Motor Vehicle - safety data

These variables show ridership patterns while providing geographic and operational context.

Key Findings

Massive ridership differences - Some cities have 250M+ riders while others have under 50M, showing huge variation in transit usage.
Stable trends - Ridership stayed consistent between 2014-2015, suggesting established commuting patterns.
Clear tiers - Data shows distinct groups: major transit hubs, mid-size systems, and smaller regional services.

This reveals how differently American cities approach public transportation and suggests ridership reflects lo...
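The variable selection step itself is a one-liner in R. This sketch uses column names copied from the list above applied to a tiny stand-in data frame; the real dataset's headers may be spelled differently:

```r
# Stand-in data frame mirroring the post's variable names (hypothetical)
transit <- data.frame(
  Year = 2014:2015, `Primary USA City` = "Example City", Ridership = 0,
  `Primary UZA Sq Miles` = 0, `Vehicle Revenue Hours` = 0,
  `Collisions with Motor Vehicle` = 0, Unused1 = 0, Unused2 = 0,
  check.names = FALSE
)

# Keep 6 of the original columns by name
keep <- c("Year", "Primary USA City", "Ridership",
          "Primary UZA Sq Miles", "Vehicle Revenue Hours",
          "Collisions with Motor Vehicle")
subset_transit <- transit[, keep]
```

Selecting by name rather than position keeps the code valid even if the source file reorders its columns.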

Assignment 4 - R-PRO

Hospital Data Analysis Using R

This assignment involved analyzing fictional hospital data to practice data cleaning, visualization, and interpretation in R. The dataset contained patient visit frequencies, blood pressure readings, and physician assessments.

Data Preparation

First, I created the dataset and handled missing values.

Visualizations

Boxplots: I created three boxplots to compare blood pressure across different assessments.
Histograms: I also created histograms to show the overall distributions.

Data Analysis Results

The boxplots show that patients with "bad" first assessments had higher blood pressure than those with "good" assessments. This pattern held true for the second assessment and final decision categories as well - patients labeled "high risk" consistently showed elevated blood pressure readings. The data suggests doctors' judgments matched the actual vital signs.

The histograms reveal interesting patterns in the data distribution. Visit...
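A minimal sketch of the workflow described above, using invented variable names and simulated values since the assignment's actual dataset isn't shown here:

```r
# Fictional hospital data; column names are illustrative, not the
# assignment's actual variables
set.seed(42)
hospital <- data.frame(
  visits     = rpois(30, lambda = 4),
  bp         = round(rnorm(30, mean = 130, sd = 15)),
  assessment = sample(c("good", "bad"), 30, replace = TRUE)
)
hospital$bp[c(3, 17)] <- NA   # simulate missing readings

# One common way to handle missing values: replace NAs with the column mean
hospital$bp[is.na(hospital$bp)] <- mean(hospital$bp, na.rm = TRUE)

# Boxplot comparing blood pressure across assessment categories
boxplot(bp ~ assessment, data = hospital,
        xlab = "First assessment", ylab = "Blood pressure")

# Histogram of the overall blood pressure distribution
hist(hospital$bp, main = "Blood pressure distribution", xlab = "BP")
```

Mean imputation is only one option for the cleaning step; dropping incomplete rows with `na.omit()` is the other common choice, and which is appropriate depends on the assignment's instructions.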

Module 3

Module 3: Refined Map with Color
September 14, 2025

Introduction

For this assignment, I enhanced my Module 2 Florida COVID-19 map using Adobe Illustrator. The goal was to apply color theory and vector graphics to make the visualization more professional and accessible while preserving the original data integrity.

Color Strategy and Readability

Rather than rebuilding the entire visualization, I kept the original Tableau color scheme because it already told the data story effectively. The sequential palette from light blue to dark orange creates a clear hierarchy where viewers can instantly identify high-case areas through the warmer, darker colors. This approach maintained my Module 2 work while adding the professional enhancement required for Module 3.

Vector Elements Added

I used Illustrator's tools to add three key enhancements:
Title and Typography: Added "Florida COVID-19 Cases by ZIP Code - April 8, 2020" to provide immediate context and establish the timeframe.
Cit...

My Geographic Map of Florida COVID-19 Cases by ZIP Code

Introduction

For this mapping exercise, I chose to visualize Florida COVID-19 data from 2020, specifically examining the distribution of cases across ZIP codes throughout the state. This dataset provides a comprehensive view of how the pandemic affected different geographic regions of Florida during the early stages of the outbreak. The map reveals fascinating patterns about population density, urban centers, and the geographic spread of COVID-19 cases across the Sunshine State.

Dataset and Process

Data Source: Florida COVID-19 data by ZIP Code (2020). https://drive.google.com/file/d/1fNyh8DCZAkJuXGEi0XuQGjKpR5lsGPEK/view

Challenges Faced

My biggest hurdle was Tableau Public repeatedly crashing. Since starting this project on September 1st, the app consistently froze and closed whenever I dragged fields to create the map. After multiple restarts and attempts, it finally worked properly. The dataset used ZIP codes rather than street addresses, but this actually worked better for v...

Importance of Variable Naming - Assignment 2

Today I tackled a function debugging exercise in R that taught me an important lesson about variable scope and naming consistency. I was given a vector of test data and a broken function to fix.

The Problem: The issue was inconsistent variable naming. While the function parameter was correctly named assignment2, inside the function body the code referenced assignment (missing the "2") and someData (a completely different name). R couldn't find these variables because they simply didn't exist.

What I Learned: This exercise reinforced how crucial attention to detail is in programming. A simple typo in a variable name can break an entire function, even when the logic is perfectly sound. It also highlighted the importance of understanding function scope in R - variables inside functions need to either be parameters, locally defined, or accessible from the global environment.
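A reconstruction of the kind of bug described (the original assignment code isn't shown, so the function body and test vector here are hypothetical):

```r
test_data <- c(4, 8, 15, 16, 23, 42)   # sample vector, for illustration

# Broken version: the body references names that don't exist in any scope
summarize_broken <- function(assignment2) {
  total <- sum(assignment)    # Error: object 'assignment' not found
  avg   <- mean(someData)     # Error: object 'someData' not found
  c(total = total, mean = avg)
}

# Fixed version: every reference matches the parameter name
summarize_fixed <- function(assignment2) {
  total <- sum(assignment2)
  avg   <- mean(assignment2)
  c(total = total, mean = avg)
}

summarize_fixed(test_data)   # runs once the names are consistent
```

Note that R only raises the error when the broken function is called, not when it is defined, which is part of what makes this kind of bug easy to miss.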