#!/usr/bin/env python
# coding: utf-8

# [![nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/4dsolutions/clarusway_data_analysis/blob/main/DVwPY_S7/Recap_and_Capstone_Concepts.ipynb)
#
# ---
#
# CLRSWY
#
# ## WAY TO REINVENT YOURSELF
#
# ## Recap and Capstone Example
#
# ## Final Session
#
# Our Clarusway curriculum takes a cumulative path through the Python ecosystem, as we discussed yesterday when we entered recap and review mode. We are still in that mode today.
#
# This session continues our recap [started in a previous session](https://github.com/4dsolutions/clarusway_data_analysis/blob/main/DVwPY_S6/6-DVwPy_Review_Overview_Preview.ipynb) and draws upon everything we have learned so far.
#
# We started in [DAwPy (Data Analysis with Python)](https://github.com/4dsolutions/clarusway_data_analysis/blob/main/DAwPy_S1_(Numpy_Arrays)/daily_schedule.ipynb) by introducing numpy and rolling forward through pandas until we were brushing up against:
#
# * Data Visualization (starting to import matplotlib)
# * Machine Learning ([last Session of DAwPy](https://github.com/4dsolutions/clarusway_data_analysis/blob/main/DAwPy_S12_(Recap-EDA%20on%20Titanic%20Dataset)/DAwPy-S12%20(Recap-EDA%20on%20Titanic%20Dataset).ipynb))
#
# In [DVwPy (Data Visualization with Python)](https://github.com/4dsolutions/clarusway_data_analysis/blob/main/DVwPY_S1/daily_schedule.ipynb) we explored:
#
# * plotting with matplotlib
# * plotting from pandas
# * seaborn
# * plotly and dash (with mention of [bokeh](https://docs.bokeh.org/en/latest/index.html))
#
# The transition from Seaborn to plotly, [introduced yesterday](https://github.com/4dsolutions/clarusway_data_analysis/blob/main/DVwPY_S6/6-DVwPy_Interactive%20Plots%20with%20Plotly%20(A%20Beginner%20Friendly%20Guide).ipynb), was not intended to impart all of plotly in a single day, after taking three days sailing in the Seaborn sea.
#
# Plotly Express is as high level as Seaborn, but plotly can also work at a lower, more detailed level, akin to how we use matplotlib.
#
# However, we saw the leverage we now have, as data analysts familiar with Seaborn, to apply the same concepts, such as the visualization types (violin, boxplot, swarm), across platforms. Our ability to ascend this new learning curve may indeed have increased, thanks to the transferability of the ideas.
#
# Here's a screenshot of an exhibit we arrived at yesterday, plotting Power Plants around the world on an orthographic projection, with embedded tools such as zoom and pan.
#
# [Screenshot, 2023-01-16: Power Plants plotted on an orthographic projection]
#
# ### Primary Skill Sets
#
# The skill sets we have been developing most in the foreground encompass:
#
# * [Jupyter](https://jupyter.org/) -- how we develop and publish our content
# * [numpy](https://numpy.org/) -- the number crunching king of vectorized operations
# * [matplotlib](https://matplotlib.org/) -- [pyplot](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html) imitates MATLAB's way of doing things
# * [pandas](https://pandas.pydata.org/) -- sophisticated DataFrames, isomorphic to spreadsheets*
# * [seaborn](https://seaborn.pydata.org/#) -- augmented data visualization, beyond anything up to now
# * [plotly](https://plotly.com/python/) -- our final sessions dive into this visualization alternative
# * [python](https://python.org/) -- Python itself has more power, as one learns to tap it
# * [Regexes](https://regex101.com/) -- pattern matching, a standalone mini-language incorporated by Python
# * [$\LaTeX$](https://www.latex-project.org/) -- we use it to share math typography in a Jupyter Notebook context
#
# ### Secondary Skill Sets
#
# Secondarily, we have been reviewing the relevance of:
#
# * [Git](https://git-scm.com/)
# * [git bash](https://www.atlassian.com/git/tutorials/git-bash)
# * [GitHub](https://github.com/)
# * [nbviewer](https://nbviewer.org/)
# * [nbconvert](https://nbconvert.readthedocs.io/en/latest/install.html#installing-tex)
# * [SQLite](https://www.sqlite.org/index.html)
# * [Anaconda](https://www.anaconda.com/download)
# * [conda](https://docs.conda.io/en/latest/)
# * [pip](https://pip.pypa.io/en/stable/index.html)
# * [PyPI](https://pypi.org/)
# * [Flask](http://thekirbster.pythonanywhere.com/) and [Dash](./dash_app/opiod_epidemic.py)
# * [Docker](https://www.docker.com/)
#
# **GitHub and Your Online Portfolio**
#
# You're free to clone the content of public repos, perhaps after installing Git for Windows. We went through demonstrations of cloning:
#
# * [this curriculum](https://github.com/4dsolutions/clarusway_data_analysis)
# * [the Python Data Science Handbook by Jake VanderPlas](https://github.com/jakevdp/PythonDataScienceHandbook)
#
# You're also free to create *your own* free public repos and start populating them with your own public content.
#
# What blend of skills would you like to showcase?
#
# In what formats would you like to demonstrate your expertise? May we suggest the Jupyter Notebook (ipynb files).
#
# **Background:**
#
# Since the open source revolution, many enterprises have gravitated to a base layer of communally shared products and projects, such as Python and SQLAlchemy, on top of which a proprietary, non-public layer is added.
#
# Sometimes the non-public value added is not only source code, but also data. Even with a communally shared ecosystem, there's still plenty of room for private enterprise.
#
# PayPal calls it "InnerSource": using open source tools and workflows in a proprietary environment. This is how many, and probably most, internet enterprises are constituted: using a combination of open and closed tools.
#
# As a result of these developments, those of us wishing to join these enterprises have an opportunity to learn some of their tools (the open source ones) without investing in expensive license agreements or taking on the overhead of private ownership.
#
# Finally, we have been learning about:
#
# * [sources of data sets](DataSources.ipynb) and how to use them
# * sources of learning materials and how to use them
# * ways of running our Notebooks in the cloud
#     - on [Google Colab](https://colab.research.google.com/)
#     - in a [Dash / Plotly](https://dash.plotly.com/) account
#     - on [Kaggle](https://www.kaggle.com/)
#     - with [Binder](https://mybinder.org/)
# * the continuing relevance of IDEs
#     - [Spyder](https://docs.spyder-ide.org/current/index.html)
#     - [IDLE](https://docs.python.org/3/library/idle.html)
#     - [PyCharm](https://www.jetbrains.com/pycharm/)
#     - [VS Code](https://code.visualstudio.com/)
# * [the Capstone Project](https://github.com/4dsolutions/clarusway_data_analysis/blob/main/DVwPY_S7/CapStone_Example_Bike_Sharing.ipynb) genre (one of today's topics)
#
