Exploring Inc. Magazine’s 2018 fastest growing private companies in America


Data Visualization Website: https://inc2018-dataviz-tj.herokuapp.com
GitHub


Exploring Inc. Magazine’s 2018 list of the 5000 fastest-growing private companies in the United States


Data Source

  • Inc. Magazine publishes a ranking of the fastest-growing private companies every year. The full data sets are hosted on data.world.

Project Overview

  • Data Storage: PostgreSQL
  • Workflow Engine (WFE): Flask Web Server/SQLAlchemy/Python
  • Web Application/GUI: HTML/CSS, JavaScript, D3, Leaflet.js
  • Production Deployment on Heroku.com: https://inc2018-dataviz-tj.herokuapp.com
  • Product: Interactive web data-journalism visualization and a JSON API for the Inc. 5000 data

1. Data Extract and Load

  • CSV-formatted data is downloaded from data.world
  • Local: Python/SQLAlchemy/psycopg2 scripts (on GitHub) extract the data and load it into the PostgreSQL database
  • Deployment: uses the PostgreSQL DB on heroku.com. The database initialization script is initdb.py
  • The Flask web server provides the data as a JSON API
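
The load step above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual script: SQLite stands in for PostgreSQL so the example runs without a server, and the table name, column names and sample rows are all hypothetical (the real CSV from data.world may use different headers).

```python
import csv
import io
import sqlite3

# Hypothetical two-row sample; the real CSV comes from data.world and its
# column names may differ.
SAMPLE_CSV = """ranking,company,state,city
1,Example Co A,CA,San Diego
2,Example Co B,TX,Austin
"""


def load_csv(conn, csv_file):
    """Create the table if needed, insert every CSV row, return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inc2018 ("
        "ranking INTEGER PRIMARY KEY, company TEXT, state TEXT, city TEXT)"
    )
    rows = [
        (int(r["ranking"]), r["company"], r["state"], r["city"])
        for r in csv.DictReader(csv_file)
    ]
    # Parameterized insert: values are never spliced into the SQL string.
    conn.executemany("INSERT INTO inc2018 VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL connection
loaded = load_csv(conn, io.StringIO(SAMPLE_CSV))
first = conn.execute("SELECT company FROM inc2018 WHERE ranking = 1").fetchone()
print(loaded, first[0])
```

With psycopg2 the same pattern applies, except that the placeholder style is %s rather than ?.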

2. Workflow Engine and JSON API format

  • The Flask web server, SQLAlchemy and Python create the API routes and the JSON data used for visualization
  • Flask API JSON Data Routes (app.py):
    • @app.route("/2018metadata")
      Return Full Inc2018 5000 JSON Metadata

    • @app.route("/2018metadata/pages/<num>")
      Return Inc2018 5000 JSON Metadata by page, when num=0 return full data

    • @app.route("/2018metadata/<filter_name>")
      Return filtered Inc2018 5000 JSON Metadata

    • @app.route("/2018metadata/plot/<plot_name>")
      Return filtered Inc2018 5000 JSON Metadata for plotting

    • @app.route("/rank/<ranking_number>")
      Return ranking query JSON data

    • @app.route("/state_s/<state_s>")
      Return State query JSON data

    • @app.route("/state_l/<state_l>/<page_num>")
      Return State long name data by page

    • @app.route("/city/<city>/<page_num>")
      Return city data by page

    • @app.route("/years_on/<yrs_on_list>")
      Return years on the list query JSON data

    • @app.route("/years_on/<yrs_on_list>/<page_num>")
      Return years on the list query JSON data by page

    • @app.route("/founded_year/<founded>")
      Return founded year query JSON data

    • @app.route("/founded_year/<founded>/")
      Return founded year query JSON data by page

    • @app.route("/industry/<industry>/")
      Return industry data by page

    • @app.route("/industry_growth_rev")
      Return growth/revenue data groupby industry

    • @app.route("/topten_cities")
      Return top ten cities

    • @app.route("/topten_companies")
      Return the top ten companies

    • @app.route("/growth_rev_state")
      Return growth/revenue data groupby states

  • API JSON Data Format
    json_format
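
The paging rule stated for the /2018metadata/pages/ route (num=0 returns the full data) can be sketched as a small helper that does not depend on Flask. The page size, the 1-based page numbering and the placeholder records are assumptions; app.py may handle these differently.

```python
import json

PAGE_SIZE = 100  # assumed; the project's real page size is not stated


def paginate(records, num, page_size=PAGE_SIZE):
    """Return page `num` (1-based) of `records`; num == 0 returns everything."""
    if num < 0:
        raise ValueError("page number must be non-negative")
    if num == 0:
        return list(records)
    start = (num - 1) * page_size
    return list(records[start:start + page_size])


# Placeholder records standing in for the Inc. 5000 metadata.
records = [{"ranking": i} for i in range(1, 251)]

page_two = paginate(records, 2)
body = json.dumps(page_two)  # the JSON text a paged route could send back
print(len(paginate(records, 0)), page_two[0]["ranking"], len(paginate(records, 3)))
```

A page number past the end of the data yields an empty list here rather than an error, which keeps client-side paging loops simple.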

3. Data Visualization

  • Explore how geographic location relates to the fastest-growing private companies.
    • State views of the companies
    • City views of the companies
    • Industry-related views of the companies
    • Revenue-related views of the companies
    • Distribution of companies by different filters
  • Explore individual company information
    • Visualize basic company information
    • Headcount/Revenue/Years on the list/CEO
    • Website information
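
The state and industry views listed above rest on simple grouped aggregates. The following is a rough sketch of that kind of aggregation, assuming hypothetical field names and made-up placeholder records; it does not reproduce the project's actual queries.

```python
from collections import Counter, defaultdict

# Made-up placeholder records; the field names are assumptions.
companies = [
    {"company": "Example Co A", "state": "CA", "industry": "Software", "revenue": 12.5},
    {"company": "Example Co B", "state": "TX", "industry": "Health", "revenue": 8.0},
    {"company": "Example Co C", "state": "CA", "industry": "Software", "revenue": 4.5},
]


def count_by(records, field):
    """Count records per distinct value of `field`."""
    return Counter(r[field] for r in records)


def total_by(records, group_field, value_field):
    """Sum `value_field` within each `group_field` group."""
    totals = defaultdict(float)
    for r in records:
        totals[r[group_field]] += r[value_field]
    return dict(totals)


state_counts = count_by(companies, "state")
industry_revenue = total_by(companies, "industry", "revenue")
print(state_counts.most_common(1), industry_revenue)
```

In the deployed application the same grouping would more likely be done in SQL (GROUP BY) so that only the aggregated rows travel to the browser.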

4. Data Visualization Website and Dashboard

  • Dashboard
    • Overview of the whole dataset: geographic information, industry information, top cities and top companies
    • Full list
  • Company Profiles
    • Choose the rank, city, state, founded year or years on the list to get detailed company profiles
  • Industry Charts
    • Choose the industry sector, city, state, founded year or years on the list to get industry-based growth and revenue information
  • Location Charts
    • Choose the city, state, founded year or years on the list to get location-based information
  • Table List
    • Choose the industry, city, state, founded year or years on the list to get the full raw data table by page
  • Maps
    • Display all geographic information with the related growth and revenue information
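
Each filter choice on these pages corresponds to one of the API routes listed in section 2. As an illustration only, the mapping from a chosen filter to an API path might look like the sketch below. The route templates are taken from that list, but the filter keys and the helper name are hypothetical, and the site's front end does this in JavaScript rather than Python.

```python
from urllib.parse import quote

# Route templates copied from the API list; the dictionary keys are
# hypothetical names for the filter controls.
ROUTES = {
    "rank": "/rank/{value}",
    "state": "/state_s/{value}",
    "founded": "/founded_year/{value}",
    "years_on": "/years_on/{value}",
}


def profile_path(filter_name, value):
    """Build the API path for one filter choice, percent-encoding the value."""
    try:
        template = ROUTES[filter_name]
    except KeyError:
        raise ValueError(f"unknown filter: {filter_name!r}") from None
    return template.format(value=quote(str(value), safe=""))


print(profile_path("rank", 1))
print(profile_path("state", "CA"))
```

Percent-encoding the value matters for inputs that contain spaces or slashes, which would otherwise change the shape of the path.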

5. Deployment Notes

  • Database initialization: set the primary key before deployment.

    - $ heroku pg:psql
    - At the psql prompt: ALTER TABLE TABLENAME ADD PRIMARY KEY (key_name);
       
    
  • PostgreSQL Heroku deployment

    Create the PostgreSQL DB
    - $ heroku addons:create heroku-postgresql:hobby-dev
    - $ heroku pg:info
    Push the local DB to Heroku
    - $ heroku pg:push mylocaldb HEROKU_POSTGRESQL_MAGENTA --app APPNAME
    
  • Connecting to Python
    • To use PostgreSQL as your database in Python applications you will need to use the psycopg2 package.
    import os
    import psycopg2
    DATABASE_URL = os.environ['DATABASE_URL']
    conn = psycopg2.connect(DATABASE_URL, sslmode='require')
    
  • Test fully on all browsers (Safari/Chrome/Firefox/IE) to resolve compatibility issues

  • Git Process
   - git add .
   - git commit -am "note"
   - git push
   - git push heroku master
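
One practical detail related to the "Connecting to Python" note: Heroku sets DATABASE_URL with a postgres:// scheme. psycopg2 accepts that form, but SQLAlchemy 1.4 and later only recognize postgresql://. A small helper can normalize the value before it is handed to SQLAlchemy. The helper name and the fallback URL below are hypothetical, and whether this project needs the step depends on the SQLAlchemy version it pins.

```python
import os


def sqlalchemy_url(url):
    """Rewrite a postgres:// URL to the postgresql:// form SQLAlchemy 1.4+ expects."""
    prefix = "postgres://"
    if url.startswith(prefix):
        return "postgresql://" + url[len(prefix):]
    return url


# Placeholder URL for local runs; on Heroku the config var is set by the add-on.
raw_url = os.environ.get(
    "DATABASE_URL", "postgres://user:secret@localhost:5432/inc2018"
)
db_url = sqlalchemy_url(raw_url)
print(db_url.split("://", 1)[0])  # show only the scheme, never the credentials
```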