KDnuggets examines the top data science and machine learning tools, with considerable data visualizations at the link.
Tags: Anaconda, Apache Spark, Big Data Software, Deep Learning, Excel, Keras, Poll, Python, R, RapidMiner, scikit-learn, Software, SQL, Tableau, TensorFlow
We identify the 6 tools in the modern open-source Data Science ecosystem, examine the Python vs R question, and determine which tools are used the most with Deep Learning and
By Gregory Piatetsky, KDnuggets.
Recently we reported the results of the 20th annual KDnuggets Software Poll:
Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis.
As we have done before (see 2017 data science ecosystem, 2018 data science ecosystem), we examine which tools were part of the same answer - the skillset of the user. We note that this does not necessarily mean that all tools were used together on each project, but having the knowledge and skills to use both tools X and Y makes it more likely that both X and Y were used together on some projects. The results we see are consistent with this assumption.
The top tools show surprising stability - we see essentially the same pattern as last year.
First, we selected the tools with at least 20% of the vote. There were 11 such tools - exactly the same list of 11 tools as last year, although the order has changed a little. Keras moved up from n. 10 to n. 8, and Anaconda moved up from n. 6 to n. 5. Tableau and SQL moved down a little.
The cutoff for this group of 11 is a natural one, since there is a big gap between n. 11 (Apache Spark, with 21%) and n. 12 (Microsoft Power BI, 13%).
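The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the poll's raw responses are not part of this excerpt, so the `responses` list below is made-up toy data, and the function names `tool_shares` and `top_tools` are hypothetical.

```python
from collections import Counter

def tool_shares(responses):
    """Return {tool: fraction of respondents who selected it}."""
    counts = Counter(tool for tools in responses for tool in set(tools))
    n = len(responses)
    return {tool: count / n for tool, count in counts.items()}

def top_tools(responses, min_share=0.20):
    """Keep tools with at least min_share of the vote, highest share first."""
    shares = tool_shares(responses)
    selected = [(tool, share) for tool, share in shares.items() if share >= min_share]
    return sorted(selected, key=lambda item: (-item[1], item[0]))

# Toy data (NOT the real poll): each set is one respondent's answer.
responses = [
    {"Python", "SQL"},
    {"Python", "R"},
    {"Python"},
    {"Python", "R"},
    {"Python", "SQL"},
    {"Python"},
    {"Python", "Power BI"},
    {"Python"},
    {"R"},
    {"Excel"},
]

for tool, share in top_tools(responses):
    print(f"{tool}: {share:.0%}")
```

With the real poll data, the same filter would keep the 11 tools above the 20% line and drop everything from Microsoft Power BI (13%) down.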
We used the same Lift measure as in our 2017 analysis and 2018 analysis.
We then grouped together the tools with the strongest association, starting with TensorFlow and Keras, until we arrived at Figure 1 below. We made the patterns easier to see by showing only associations with abs(Lift1) > 15%. ... "
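This excerpt does not restate the Lift formula, so the sketch below assumes the usual definition: Lift(X, Y) = pct(X and Y) / (pct(X) * pct(Y)), with Lift1 = Lift - 1, so that 0 means the two tools are chosen together exactly as often as independence would predict. The toy `responses` data and the names `lift1` and `strong_pairs` are hypothetical, not taken from the article.

```python
from itertools import combinations

def lift1(responses, x, y):
    """Lift1 = pct(x and y) / (pct(x) * pct(y)) - 1 (assumed definition).

    Assumes both x and y appear in at least one response.
    """
    n = len(responses)
    p_x = sum(x in r for r in responses) / n
    p_y = sum(y in r for r in responses) / n
    p_xy = sum(x in r and y in r for r in responses) / n
    return p_xy / (p_x * p_y) - 1

def strong_pairs(responses, tools, threshold=0.15):
    """Return {(x, y): Lift1} for pairs with abs(Lift1) above the threshold."""
    pairs = {}
    for x, y in combinations(sorted(tools), 2):
        value = lift1(responses, x, y)
        if abs(value) > threshold:
            pairs[(x, y)] = value
    return pairs

# Toy data (NOT the real poll): each set is one respondent's answer.
responses = [
    {"TensorFlow", "Keras"},
    {"TensorFlow", "Keras"},
    {"TensorFlow", "Excel"},
    {"Excel"},
]

for pair, value in strong_pairs(responses, {"TensorFlow", "Keras", "Excel"}).items():
    print(pair, f"{value:+.0%}")
```

A positive Lift1 marks tools that appear in the same answer more often than chance would suggest, and a negative value marks tools that tend not to appear together.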
Monday, June 17, 2019