Introduction
I’m working on a project for my Database Systems course. As part of the project, my project partner and I want to be able to connect to our MySQL database and use Weka to train a classifier based on the data found in the database and then use that classifier to make predictions about unseen (future) data. A pretty typical ML exercise. Here’s how it’s done.
For those who don’t know, Weka is a machine learning utility and a Java machine learning library. You can learn about it here: http://www.cs.waikato.ac.nz/ml/weka/ Weka is pretty snazzy in that it lets you use dozens (perhaps hundreds?) of machine learning algorithms either through a nice Java OOP interface directly in your code, or as a prototyping/research/study tool right through its GUI, with access to all the same algorithms. (I’m talking about Support Vector Machines at the click of a button; that’s huge.)
I’m going to do this on Ubuntu 9.10, and I will assume that you already have MySQL installed or have remote access to a MySQL server. The Weka version I am working with is 3.6.
Installing Weka on Ubuntu
At the command prompt in Ubuntu type:
sudo apt-get install weka
This will get Weka installed. (You can now type “weka” at the command line and click on “Explorer” to play with the GUI.)
Installing Eclipse
(Skip if you don’t want to use Eclipse)
Now that Weka is installed, we are going to install Eclipse. (If you’re using something other than Ubuntu, follow your OS’s directions instead.) In Ubuntu, at the command prompt type:
sudo apt-get install eclipse eclipse-jdt
Installing the MySQL Driver
In Ubuntu type:
sudo apt-get install libmysql-java
This places the JAR needed to talk to MySQL at /usr/share/java/mysql-connector-java.jar (this is actually a link to the real JAR, which sits in the same directory under the same name plus a version number).
Configuring DatabaseUtils.props
This part is very important. Grab your favorite file unzipper/extractor utility and open /usr/share/java/weka.jar (actually a link to a JAR of the same name with the Weka version number appended). I just use GNOME and point the file browser at /usr/share/java/weka.jar. From there, extract /weka/experiment/DatabaseUtils.props.mysql. Put this file into your home directory, but rename it to DatabaseUtils.props. Open this file and edit the jdbcDriver and jdbcURL lines; the values are listed at the end of this post.
Creating the Project in Eclipse
(Even if you don’t use Eclipse, you need to set your CLASSPATH to the locations given at the bottom of this section, so at least do that.)
Click: File -> New -> Java Project
Fill out: Project name:
Click Next
Click on the Libraries tab
Click on Add External JARs…
Browse to /usr/share/java (may differ by OS) and add “mysql-connector-java.jar” and “weka.jar.”
If you’re not using Eclipse, make sure your CLASSPATH includes both /usr/share/java/mysql-connector-java.jar and /usr/share/java/weka.jar.
(Note: If you’re not using Ubuntu 9.10 and even if you are, make sure these files are where I say they are; they may shift around between versions of Java/Ubuntu/Weka.)
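If you want to confirm that both JARs are actually visible to Java before writing any Weka code, a small check along these lines may help. This is only a sketch: the class name ClasspathCheck and its helper methods are made up for illustration, the two Weka class names are the ones imported later in this post, and org.gjt.mm.mysql.Driver is the driver name used in DatabaseUtils.props.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: reports which classes cannot be found.
class ClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the subset of the given class names that could not be found.
    static List<String> findMissing(String... classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            if (!isOnClasspath(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = findMissing(
                "weka.core.Instances",            // from weka.jar
                "weka.experiment.InstanceQuery",  // from weka.jar
                "org.gjt.mm.mysql.Driver");       // from mysql-connector-java.jar
        if (missing.isEmpty()) {
            System.out.println("Weka and the MySQL driver are both on the classpath.");
        } else {
            System.out.println("Not found on the classpath: " + missing);
        }
    }
}
```

Run it with the same CLASSPATH (or from the same Eclipse project) you plan to use for the real code; anything it lists as missing points to a JAR that is not where the classpath says it is.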
Writing the Java Code
For more details check out: http://weka.wikispaces.com/Use+WEKA+in+your+Java+code
Create a new Java file in the Eclipse project you just created or wherever you’re doing your programming. At the top of your file type the following:
import weka.core.Instances;
import weka.experiment.InstanceQuery;
then in the body of a function type:
InstanceQuery query = new InstanceQuery();
query.setUsername("nobody");
query.setPassword("");
query.setQuery("select * from whatsoever");
// if your data is sparse, then you can say so too
// query.setSparseData(true);
Instances data = query.retrieveInstances();
That code comes from the Weka wiki (http://weka.wikispaces.com/Use+WEKA+in+your+Java+code). If you got this far, you should be able to use the Weka wiki to go from here. I will add more to this post as I get further myself. For now this is as far as I’ve gotten 🙂
Good luck!
The DatabaseUtils.props lines to edit (see “Configuring DatabaseUtils.props” above):
# JDBC driver (comma-separated list)
jdbcDriver=org.gjt.mm.mysql.Driver
# database URL
jdbcURL=jdbc:mysql://server_name:3306/database_name
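A typo in those two settings is easy to make and hard to spot, so here is a rough sketch of a sanity check using plain java.util.Properties. It assumes the file lives at DatabaseUtils.props in your home directory, as described in this post; the class name PropsCheck and its methods are hypothetical, and the check only looks at the general shape of the values, not whether the server is reachable.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of Weka: checks the two settings edited in this post.
class PropsCheck {

    // Parses text in the standard Java .properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Returns a description of the first problem found, or null if both settings look plausible.
    static String findProblem(Properties props) {
        String driver = props.getProperty("jdbcDriver");
        String url = props.getProperty("jdbcURL");
        if (driver == null || driver.trim().isEmpty()) {
            return "jdbcDriver is not set";
        }
        if (url == null || !url.trim().startsWith("jdbc:mysql://")) {
            return "jdbcURL does not start with jdbc:mysql://";
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location: the home directory, as described in this post
        String path = System.getProperty("user.home") + "/DatabaseUtils.props";
        String text = new String(Files.readAllBytes(Paths.get(path)));
        String problem = findProblem(parse(text));
        System.out.println(problem == null
                ? "Settings look plausible in " + path
                : "Check " + path + ": " + problem);
    }
}
```

Remember to replace server_name and database_name with your own values; this sketch does not flag the placeholders.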
6 Comments
You might be interested to take a look at the collection of tutorials and videos on MYSQL.
Tutorials:
http://www.dataminingtools.net/browsetutorials.php?tag=mys
Videos:
http://www.dataminingtools.net/videos.php?id=5
You might also take a look at the collection of tutorials and videos on WEKA.
Tutorials: http://www.dataminingtools.net/browsetutorials.php?tag=weka
Videos: http://www.dataminingtools.net/videos.php?id=6
Hope this helps.
Hey, thanks for this post, very helpful. I’m trying to do analysis on an auction site I’m working on, and this was perfect!
Thanks … I’m going to use it in NetBeans and Windows
Thanks! Just want to start a project in nearly the same setting~
Hello,
I have a student project and I use Weka to find rules with the Apriori algorithm.
I have a problem connecting Weka to MySQL on Windows.
Do you have any ideas or is it similar to your work listed here?
Thanks,
Nevena