What are the requirements for Data Mining Jobs?

To understand the requirements of data mining jobs, I created a tag cloud using these steps:

  1. Used Yahoo Pipes. I created my own pipe, but this one has more feeds: Job Feed Aggregator by Sean Dolan. The pipe aggregates feeds from different job websites and returns unique job listings that you can subscribe to via RSS.
  2. Subscribed to the RSS feed for the keyword “data mining”
  3. Copied the descriptions and requirements of many jobs and saved them to a text file
  4. Got the Python stemmer
  5. Applied the Python stemmer to the text file. A stemmer truncates words to their roots, so that variants of a word are combined into a single term. (Stemming is usually the first or second step in text mining.)
  6. Created a tag cloud using the services of http://www.wordle.net/ . Wordle filters out “stop words” itself, so I didn’t have to remove them. Stop words are the common words of a language, which don’t necessarily add any value for categorization.
Data Mining Jobs Tag Cloud
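
Steps 5 and 6 can be approximated in a few lines of Python. This is a minimal sketch, not the stemmer used for the tag cloud: `naive_stem` is a hypothetical helper that only strips plural endings (the same rules as step 1a of the Porter algorithm), `STOP_WORDS` is a tiny made-up list, and the sample text is invented. It does show why a word like SAS can end up as “sa” after stemming.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to", "with"}


def naive_stem(word):
    """Strip plural endings only; a real stemmer applies many more rules."""
    if word.endswith("sses"):
        return word[:-2]   # "classes" -> "class"
    if word.endswith("ies"):
        return word[:-2]   # "queries" -> "queri"
    if word.endswith("ss"):
        return word        # "business" is left alone
    if word.endswith("s"):
        return word[:-1]   # "skills" -> "skill", but also "sas" -> "sa"
    return word


def stem_counts(text):
    """Lowercase, tokenize, drop stop words, stem, and count the stems."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = [naive_stem(t) for t in tokens if t not in STOP_WORDS]
    return Counter(stems)


# Invented sample text standing in for the saved job descriptions.
sample = "Experience with SAS and SQL. Strong analytical skills and SQL experience."
counts = stem_counts(sample)
print(counts.most_common(3))
```

These counts are what a tag cloud draws: the higher the count of a stem, the bigger the word.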

The most frequent word is: experience. Companies want people with experience in different data mining techniques. You’ll see that some other big words are: SAS (stemmed as sa), Excel, SQL, analytical skills, statistics, and quantitative skills.

And how do you master these skills, you ask?

  1. Get a graduate degree in statistics, economics, mathematics, computer science, financial engineering, or industrial engineering with an emphasis on databases, data mining, and marketing.
  2. Successfully complete data mining projects using free, open-source data mining tools, such as Weka, R, Orange, and RapidMiner.
  3. Participate in data mining competitions, such as those on Kaggle and at KDD. SAS’s data mining conference also holds a competition every year.

A detailed study by Pejic Bach describes various data mining jobs and provides a table of expectations from candidates: Creating profile of data mining specialist

About the Author

The author of Tableau Data Visualization Cookbook and an award-winning keynote speaker, Ashutosh R. Nandeshwar is one of the few analytics professionals in the higher education industry who has developed analytical solutions for all stages of the student life cycle (from recruitment to giving). He enjoys speaking about the power of data, as well as ranting about data professionals who chase after “interesting” things. He earned his PhD/MS from West Virginia University and his BEng from Nagpur University, all in industrial engineering. Currently, he is leading the data science, reporting, and prospect development efforts at the University of Southern California.