Add a PR section for Evans in the README.#1

Open
ericgitangu wants to merge 3 commits into main from kwanza_tukule_case_study_assessment_evans_pr_branch

@ericgitangu
Owner

Pull Request: Kwanza Tukule Data Analyst Assessment Submission

Summary

This pull request contains the completed submission for the Kwanza Tukule Data Analyst Assessment. The solution meets all the outlined criteria, providing insights and actionable recommendations based on the given dataset. Below is a breakdown of the work completed and how it aligns with the assignment's requirements.


Criteria and Achievements

Section 1: Data Cleaning and Preparation (20 points)

  • Criteria: Inspect the dataset for missing values, duplicates, and inconsistent data types. Create a Month-Year column.
  • Achievements:
    • Missing values and duplicates were identified and addressed.
    • A Month-Year column was successfully added using feature engineering.
    • Validation:
      • Logs are color-coded for clarity in the console.
      • Tests confirm the integrity of data cleaning and feature engineering.
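
The cleaning and feature-engineering steps above could look roughly like the following. This is a minimal sketch, assuming pandas; the function name `clean_sales` and the column names `DATE`, `QUANTITY`, and `UNIT PRICE` are illustrative assumptions and may not match the dataset or the code in this PR.

```python
import pandas as pd


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates and incomplete rows, then add a Month-Year column.

    Column names are illustrative assumptions, not the confirmed schema.
    """
    out = df.drop_duplicates().copy()
    # Coerce types; unparseable entries become NaT/NaN so they can be dropped.
    out["DATE"] = pd.to_datetime(out["DATE"], errors="coerce")
    out["QUANTITY"] = pd.to_numeric(out["QUANTITY"], errors="coerce")
    out["UNIT PRICE"] = pd.to_numeric(out["UNIT PRICE"], errors="coerce")
    out = out.dropna(subset=["DATE", "QUANTITY", "UNIT PRICE"])
    # Feature engineering: a date in August 2024 becomes "August 2024".
    out["Month-Year"] = out["DATE"].dt.strftime("%B %Y")
    return out.reset_index(drop=True)


# Tiny made-up frame: one duplicate row, one missing price, one bad date.
raw = pd.DataFrame({
    "DATE": ["2024-08-18", "2024-08-18", "2024-09-02", "not a date"],
    "QUANTITY": [2, 2, 5, 1],
    "UNIT PRICE": [100.0, 100.0, None, 50.0],
})
cleaned = clean_sales(raw)
```

Dropping incomplete rows is only one way to address missing values; imputing them is an equally valid choice, depending on how many rows are affected.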

Section 2: Exploratory Data Analysis (30 points)

  • Criteria: Provide insights into total quantity and value by:
    • Category
    • Business
    • Trends over time
  • Achievements:
    • Aggregated sales by anonymized category and business.
    • Time-series analysis shows trends in sales over time.
    • Visualizations:
      • Bar charts for category and business analysis.
      • Line chart for sales trends over time.
    • Validation:
      • Tests confirm the correctness of calculations and visualizations.
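
To illustrate the aggregation described above, here is a hedged sketch that totals quantity and value per group with pandas. The helper `summarize` and the column names (`ANONYMIZED CATEGORY`, `ANONYMIZED BUSINESS`, `QUANTITY`, `UNIT PRICE`) are assumptions made for the example, and value is assumed to be quantity times unit price.

```python
import pandas as pd


def summarize(df: pd.DataFrame, by: str) -> pd.DataFrame:
    """Total quantity and value per group, largest value first."""
    # Assumption: value of a row is quantity multiplied by unit price.
    with_value = df.assign(VALUE=df["QUANTITY"] * df["UNIT PRICE"])
    return (
        with_value.groupby(by, as_index=False)[["QUANTITY", "VALUE"]]
        .sum()
        .sort_values("VALUE", ascending=False)
        .reset_index(drop=True)
    )


# Made-up rows, for illustration only.
sales = pd.DataFrame({
    "ANONYMIZED CATEGORY": ["Category-A", "Category-B", "Category-A"],
    "ANONYMIZED BUSINESS": ["Business-1", "Business-2", "Business-2"],
    "QUANTITY": [2, 1, 3],
    "UNIT PRICE": [100.0, 50.0, 10.0],
})
by_category = summarize(sales, "ANONYMIZED CATEGORY")
by_business = summarize(sales, "ANONYMIZED BUSINESS")
```

The same helper can be pointed at a Month-Year column to produce the series behind a trend line.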

Section 3: Advanced Analysis (30 points)

  • Criteria:
    • Segment businesses based on purchasing behavior.
    • Forecast total sales for the next three months.
    • Detect anomalies in sales data.
  • Achievements:
    • Customer segmentation classifies businesses into high, medium, and low-value groups.
    • Anomaly detection identifies unusual sales patterns.
    • Validation:
      • Tests validate the segmentation logic and ensure correct groupings.
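
The PR description does not state which segmentation or anomaly-detection methods were used. As one possible approach, the sketch below splits businesses into tertiles of total value and flags months by z-score; both techniques, the function names, the threshold of 2.0, and the sample figures are assumptions chosen for illustration.

```python
import pandas as pd


def segment_businesses(totals: pd.Series) -> pd.Series:
    """Label each business Low/Medium/High by tertile of total value."""
    # Ranking first avoids duplicate bin edges when totals are tied.
    ranks = totals.rank(method="first")
    return pd.qcut(ranks, q=3, labels=["Low", "Medium", "High"])


def flag_anomalies(monthly: pd.Series, threshold: float = 2.0) -> pd.Series:
    """True where a value lies more than `threshold` std devs from the mean."""
    z = (monthly - monthly.mean()) / monthly.std(ddof=0)
    return z.abs() > threshold


# Made-up totals per business and per month.
totals = pd.Series({"Business-1": 900.0, "Business-2": 120.0, "Business-3": 450.0})
segments = segment_businesses(totals)

monthly = pd.Series([100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 300.0])
flags = flag_anomalies(monthly)
```

A z-score is sensitive to the very outliers it is meant to find, so a median-based rule may be preferable on short or heavily skewed series.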

Section 4: Strategic Insights and Recommendations (20 points)

  • Criteria:
    • Recommend product strategies, customer retention approaches, and operational efficiencies.
    • Document these insights.
  • Achievements:
    • Generated recommendations for:
      • Product strategy based on top-performing categories.
      • Customer retention strategies for declining businesses.
      • Operational improvements for inventory optimization.
    • Recommendations are output to console and saved in:
      • outputs/product_strategy.txt
      • outputs/customer_retention.txt
      • outputs/operational_efficiency.txt
    • Validation:
      • Tests confirm the creation and correctness of these files.
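
A sketch of how recommendations might be printed and written to text files, together with a pytest-style check on the result. The helper `save_recommendations`, the test name, and the placeholder text are hypothetical; only the file names echo the ones listed above. `tmp_path` is pytest's built-in temporary-directory fixture.

```python
from pathlib import Path


def save_recommendations(recommendations: dict[str, str], out_dir: Path) -> list[Path]:
    """Write each recommendation to <out_dir>/<name>.txt and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, text in recommendations.items():
        path = out_dir / f"{name}.txt"
        path.write_text(text + "\n", encoding="utf-8")
        print(f"[{name}] {text}")  # mirror the file content on the console
        written.append(path)
    return written


def test_recommendation_files_created(tmp_path: Path) -> None:
    """Files exist under the expected names and hold the expected text."""
    paths = save_recommendations({"product_strategy": "placeholder text"}, tmp_path)
    assert [p.name for p in paths] == ["product_strategy.txt"]
    assert paths[0].read_text(encoding="utf-8").strip() == "placeholder text"
```

Writing into `tmp_path` keeps the test from touching the real output directory.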

Section 5: Dashboard and Reporting (20 points)

  • Criteria:
    • Create an interactive dashboard summarizing:
      • Sales by category and business.
      • Time-series trends.
      • Segmentation summaries.
  • Achievements:
    • Built an interactive dashboard using plotly.express with:
      • Bar and line charts.
      • Segmentation summaries.
    • Validation:
      • Data preparation steps for the dashboard are tested.

Bonus Section: Open-Ended Problem (10 points)

  • Criteria:
    • Address scalability for a larger dataset.
    • Suggest predictive analysis techniques.
  • Achievements:
    • Discussed scalability improvements using distributed storage and indexing.
    • Proposed predictive analysis techniques such as ARIMA and ML models.
    • Bonus insights are saved in outputs/bonus_questions.txt.

Testing and Validation

  • Automated tests were implemented using pytest to validate:
    • Data loading, cleaning, and feature engineering.
    • Sales overview and trends analysis.
    • Recommendation generation and output file creation.
  • Command to run the whole pipeline:
        python3 src/kwanza_tukule_analysis.py

Check the console output, the generated files, and the browser visualizations against the assessment's requirements.

  • Command to run tests:
        pytest tests/ --tb=short --disable-warnings

Achievements Summary

This submission addresses all required sections with the following highlights:

  • Cleaned and prepared dataset with missing values and duplicates handled.
  • Performed advanced analytics, including segmentation and anomaly detection.
  • Generated actionable recommendations and interactive visualizations.
  • Provided outputs in the form of files and a dashboard.
  • Implemented robust testing to validate functionality.

Notes

  • All output files for submission are located in the strategic_insights_recommendations/ and bonus_questions/ directories.
  • Dashboard generation relies on plotly.express, which renders visuals in the browser.
  • Please feel free to suggest further optimizations or improvements!

Thank you for reviewing this submission, Evans! 🙏

@ericgitangu ericgitangu force-pushed the kwanza_tukule_case_study_assessment_evans_pr_branch branch from f01bac4 to a03b2df Compare January 21, 2025 16:35
@ericgitangu
Owner Author

@evansonbiwot, here's a summary of the achievements, in addition to the README file.

@ericgitangu ericgitangu self-assigned this Feb 6, 2025