To understand the requirements of data mining jobs, I created a tag cloud using these steps:
- Used Yahoo Pipes (I created my own pipe, but this one has more feeds): Job Feed Aggregator by Sean Dolan. The pipe aggregates feeds from different job websites and returns unique job listings that you can subscribe to via RSS.
- Subscribed to the RSS feed for the keyword “data mining”
- Copied the job descriptions and requirements of many jobs, and saved them to a text file
- Got the Python stemmer
- Applied the Python stemmer to the text file. A stemmer truncates words to their roots, so that variants of a word are combined into a single term. (Stemming is usually the first or second step in text mining.)
- Created a tag cloud using http://www.wordle.net/. The service filters out “stop words,” so I didn’t have to remove them myself. Stop words are the common words of a language, which don’t necessarily add any value for categorization.
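The stemming and counting steps above can be sketched in a few lines of Python. This is only a rough illustration, not the stemmer I actually used: `naive_stem`, `SUFFIXES`, `STOP_WORDS`, and the sample text are all made up for the example, and the suffix stripping is a toy stand-in for a real stemming algorithm such as Porter's. It does show why SAS ends up as "sa" and how variants of "experience" collapse into one term.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to", "with"}

# Suffixes tried in order. This is a toy stand-in, NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s", "e")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least two characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def stem_frequencies(text):
    """Lowercase the text, drop stop words, and count the stemmed tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = [naive_stem(t) for t in tokens if t not in STOP_WORDS]
    return Counter(stems)


# Made-up sample text standing in for the saved job descriptions.
sample = (
    "Experience with SAS and SQL required. "
    "Experienced analysts with modeling experience in SAS preferred."
)
counts = stem_frequencies(sample)

# Print the two most frequent stems with their counts.
for stem, count in counts.most_common(2):
    print(stem, count)
```

A tag cloud is essentially a picture of a frequency table like `counts`, with the font size of each word scaled by its count. For real data, a proper stemmer and a full stop-word list would give cleaner results than this sketch.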
The most frequent word is: experience. Companies want people with experience in different data mining techniques. You’ll see that some other big words are: SAS (stemmed as sa), Excel, SQL, analytical skills, statistics, and quantitative skills.
And how do you master these skills, you ask?
- Get a graduate degree in statistics, economics, mathematics, computer science, financial engineering, or industrial engineering with an emphasis on databases, data mining, and marketing.
- Successfully complete data mining projects using free, open-source data mining tools, such as Weka, R, Orange, and RapidMiner.
- Participate in data mining competitions such as those on Kaggle and the KDD Cup. SAS’s data mining conference also has a data mining competition every year.
A detailed study by Pejic Bach describes various data mining jobs and provides a table of what is expected of candidates: Creating profile of data mining specialist
