Analytical tools

It takes more than a few tools to be successful at this craft. Data science sits at the interface of algebra, statistics and computer science. Visualizations and dashboards require proficiency with web development. Modern databases and data storage inevitably lead to cloud technologies.

Tools of the trade

Data science and machine learning fall between several disciplines.

Each field has subfields, each with its own tools. Statistics, for example, includes frequentist and Bayesian approaches, as well as statistical learning. Each subfield requires a set of practical tools, often software libraries, to perform computations. In our case, those tools are all Python libraries. Python is a "glue" language that works for most applications, from web apps to advanced neural networks. Speaking of the web, we also provide full-stack web development capability, especially for data-driven applications like cloud-based dashboards, visualizations and real-time databases. The backend is typically programmed in Python with Django, and the front end in JavaScript and Sass, using JavaScript libraries like D3.js and Plotly. Listing every tool of the trade would be lengthy, and quite honestly boring, so here's a dendrogram of the main ones, made with D3.js.

Relevant skills

Machine Learning
    Supervised: Linear regression, Logistic regression, Random forests, Decision trees, Support vector machines, K-nearest neighbors, Kernel methods, Elastic net, Gaussian processes, Neural networks
    Unsupervised: K-means, Principal Components Analysis, Independent Components Analysis, Mean-shift, t-SNE, Gaussian mixture models
    Python: NumPy, Pandas, Scikit-Learn, TensorFlow, Keras, PyMC3

Statistics
    Frequentist: Maximum likelihood, Bootstrap, Descriptive stats, Hypothesis testing, Confidence intervals, t-tests, z-tests, chi-square tests, ANOVA, Computational testing
    Bayesian: Bayesian updating, Markov Chain Monte Carlo, Credible intervals, Bayesian networks

Databases
    SQL: Postgres, MySQL, Psycopg2
    NoSQL: MongoDB, Firebase, Mongoose, Mongothon

Visualization
    Python: Matplotlib, Seaborn, Bokeh, Cufflinks
    JS: D3.js, Plotly
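
A dendrogram like this one is drawn from hierarchical data. As a rough sketch of how the Python side could prepare that data, the snippet below turns a nested dictionary into the name/children node shape that D3's `d3.hierarchy` reads by default (it looks for a `children` array on each node). The `SKILLS` excerpt, the `name` key, and the helpers `to_node` and `count_leaves` are assumptions made for illustration; this is not the code behind the actual page.

```python
import json

# Hypothetical excerpt of the skills tree. The labels come from the dendrogram;
# the exact nesting and the choice of a "name" key are assumptions.
SKILLS = {
    "Statistics": {
        "Frequentist": ["Maximum likelihood", "Bootstrap", "Hypothesis testing"],
        "Bayesian": ["Bayesian updating", "Markov Chain Monte Carlo"],
    },
    "Databases": {
        "SQL": ["Postgres", "MySQL"],
        "NoSQL": ["MongoDB", "Firebase"],
    },
}


def to_node(name, value):
    """Convert nested dicts and lists into {"name": ..., "children": [...]} nodes."""
    if isinstance(value, dict):
        return {"name": name, "children": [to_node(k, v) for k, v in value.items()]}
    if isinstance(value, list):
        return {"name": name, "children": [{"name": leaf} for leaf in value]}
    raise TypeError(f"unsupported value for {name!r}: {type(value).__name__}")


def count_leaves(node):
    """Count the terminal nodes (individual tools or methods) under a node."""
    children = node.get("children")
    if not children:
        return 1
    return sum(count_leaves(child) for child in children)


tree = to_node("Relevant skills", SKILLS)

# JSON text that a front end could fetch and hand to d3.hierarchy(...).
payload = json.dumps(tree, indent=2)
```

A Django view could return `payload` as a JSON response and the front end could pass the parsed object to `d3.hierarchy`; that wiring is left out here because it depends on the project layout.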