
Introduction

I’m working on a project for my Database Systems course. As part of the project, my project partner and I want to be able to connect to our MySQL database and use Weka to train a classifier based on the data found in the database and then use that classifier to make predictions about unseen (future) data. A pretty typical ML exercise. Here’s how it’s done.

For those who don’t know, Weka is a machine learning workbench and Java machine learning library. You can learn about it here: http://www.cs.waikato.ac.nz/ml/weka/ Weka is pretty snazzy in that it lets you use dozens (perhaps hundreds?) of machine learning algorithms through a nice Java OOP interface directly in your code, or use it as a prototyping/research/study tool through its GUI with access to all the same algorithms. (I’m talking about Support Vector Machines at the click of a button; that’s huge.)

I’m going to do this on Ubuntu 9.10, and I will assume that you already have MySQL installed or have remote access to a MySQL server. The Weka version I am working with is 3.6.

Installing Weka on Ubuntu

At the command prompt in Ubuntu type:

sudo apt-get install weka

This will get Weka installed. (You can now type “weka” at the command line and click on “Explorer” to play with the GUI.)

Installing Eclipse

(Skip this section if you don’t want to use Eclipse.)
Now that Weka is installed, we are going to install Eclipse. (If you’re using something other than Ubuntu, follow your OS’s directions instead.) In Ubuntu, at the command prompt type:

sudo apt-get install eclipse eclipse-jdt

Installing the MySQL Driver

In Ubuntu type:

sudo apt-get install libmysql-java

This places the JAR needed to talk to MySQL at /usr/share/java/mysql-connector-java.jar. (That path is actually a link to the real JAR, which sits in the same directory under the same name plus a version number.)
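Before involving Weka at all, it can be worth confirming that the driver can actually reach your server. Here is a minimal sketch using plain JDBC. The class name JdbcCheck and the helper buildUrl are made up for illustration, and the host, database name, user, and password are placeholders you would replace with your own.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper class, not part of Weka or the MySQL driver.
class JdbcCheck {

    // Builds a MySQL JDBC URL of the form jdbc:mysql://host:port/database.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder values: substitute your own server, database, and login.
        String url = buildUrl("localhost", 3306, "database_name");
        String user = "nobody";
        String password = "";

        // Assumes the driver registers itself with DriverManager; very old
        // driver versions may need an explicit Class.forName call first.
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            System.out.println("Connected, server version: "
                    + conn.getMetaData().getDatabaseProductVersion());
        } catch (SQLException e) {
            System.err.println("Could not connect: " + e.getMessage());
        }
    }
}
```

To run it, the connector JAR has to be on the classpath, for example: java -cp /usr/share/java/mysql-connector-java.jar:. JdbcCheck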

Configuring DatabaseUtils.props

This part is very important. Open /usr/share/java/weka.jar with your favorite archive utility (the path is actually a link to a JAR of the same name with the Weka version number appended). I just use GNOME and point the file browser at /usr/share/java/weka.jar. From there, extract /weka/experiment/DatabaseUtils.props.mysql. Put this file into your home directory, but rename it to DatabaseUtils.props. Open the file and edit the following lines:

# JDBC driver (comma-separated list)
jdbcDriver=org.gjt.mm.mysql.Driver
# database URL
jdbcURL=jdbc:mysql://server_name:3306/database_name
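
server_name should be changed to your MySQL server (for example, ‘localhost’ or ‘dbase.cs.school.edu.org’) and database_name should be changed to the database you want to use.

In this file there will also be lines like: “# string, getString() = 0;    --> nominal”
I haven’t exactly figured out what’s going on here, but these lines appear to map each SQL column type to a number that tells Weka which getter to use when reading the column and whether the result becomes a nominal or numeric attribute. If you’re going to be using varchar(N) in your database tables, you need to add the following line to that section of the file (the value is a zero, not an “oh”):

VARCHAR=0

And if you’re using INT (int), add this line too:

INT=5

and so on for any other column types you use. Keep each mapping on a line by itself: in a Java properties file, a # only starts a comment at the beginning of a line, so a trailing comment would become part of the value.

Don’t forget to save.
See here for more details: http://weka.wikispaces.com/Databases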
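If you want to double-check what ended up in your edited file, here is a small sketch that reads it with java.util.Properties and prints the entries discussed above. It assumes Weka parses the file with the same properties semantics, and the class name PropsCheck and its parse helper are made up for illustration. The brackets in the output make stray trailing text easy to spot.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper class, not part of Weka.
class PropsCheck {

    // Parses properties-format text using the standard Java rules.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Assumes the file was saved in your home directory, as described above.
        String home = System.getProperty("user.home");
        byte[] raw = Files.readAllBytes(Paths.get(home, "DatabaseUtils.props"));
        Properties props = parse(new String(raw, StandardCharsets.ISO_8859_1));

        for (String key : new String[] {"jdbcDriver", "jdbcURL", "VARCHAR", "INT"}) {
            // A value such as [0 # zero] means a trailing comment leaked into it.
            System.out.println(key + " = [" + props.getProperty(key) + "]");
        }
    }
}
```

A clean file should show VARCHAR = [0] and INT = [5]; a null means the key is missing from the file.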

Creating the Project in Eclipse

(Even if you don’t use Eclipse, you need to set your CLASSPATH to the locations given at the bottom of this section, so at least do that.)
Click: File -> New -> Java Project

Fill in the “Project name” field

Click Next

Click on the Libraries tab

Click on Add External JARs…

Browse to /usr/share/java (may differ by OS) and add “mysql-connector-java.jar” and “weka.jar.”

If you’re not using Eclipse, make sure your CLASSPATH includes both /usr/share/java/mysql-connector-java.jar and /usr/share/java/weka.jar

(Note: If you’re not using Ubuntu 9.10 and even if you are, make sure these files are where I say they are; they may shift around between versions of Java/Ubuntu/Weka.)
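A quick way to confirm that both JARs really are visible to Java is to try loading one class from each. This is only a sketch: ClasspathCheck and isLoadable are names I made up, and the class names being checked are the ones used elsewhere in this post (the driver class from DatabaseUtils.props and the two Weka classes imported below).

```java
// Hypothetical helper class, not part of Weka or the MySQL driver.
class ClasspathCheck {

    // Returns true if the named class can be found on the current classpath.
    static boolean isLoadable(String className) {
        try {
            // "false" means: locate the class without running its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] needed = {
            "weka.core.Instances",           // from weka.jar
            "weka.experiment.InstanceQuery", // from weka.jar
            "org.gjt.mm.mysql.Driver"        // from mysql-connector-java.jar
        };
        for (String name : needed) {
            System.out.println(name + ": " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it inside the Eclipse project, or from the command line with the same CLASSPATH you plan to use for your own code. Anything reported as MISSING points at a JAR that is not where you think it is.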

Writing the Java Code

For more details check out: http://weka.wikispaces.com/Use+WEKA+in+your+Java+code

Create a new Java file in the Eclipse project you just created or wherever you’re doing your programming. At the top of your file type the following:

import weka.core.Instances;
import weka.experiment.InstanceQuery;

Then, in the body of a function, type:

InstanceQuery query = new InstanceQuery();
query.setUsername("nobody");
query.setPassword("");
query.setQuery("select * from whatsoever");
// if your data is sparse, then you can say so too
// query.setSparseData(true);
Instances data = query.retrieveInstances();

That code comes from the Weka wiki (http://weka.wikispaces.com/Use+WEKA+in+your+Java+code). If you got this far, you should be able to use the Weka wiki to go from here. I will add more to this post as I get further myself. For now, this is as far as I’ve gotten 🙂

Good luck!


6 Comments

  1. You might be interested to take a look at the collection of tutorials and videos on MYSQL.
    Tutorials:
    http://www.dataminingtools.net/browsetutorials.php?tag=mys
    Videos:
    http://www.dataminingtools.net/videos.php?id=5

  2. You might also take a look at the collection of tutorials and videos on WEKA.
    Tutorials: http://www.dataminingtools.net/browsetutorials.php?tag=weka
    Videos: http://www.dataminingtools.net/videos.php?id=6

    Hope this helps.

  3. Hey Thanks for this post very helpful. Trying to analysis on a auction site im working on, and this was perfect!

  4. Thanks … I m going to use it in netbeans and windows

  5. Thanks! Just want to start a project in nearly the same setting~

  6. Hello,
    I have students project and I use Weka to find rules with apriori algorithm.
    I have a problem connecting Weka to MySQL on Windows.
    Do you have any ideas or is it similar to your work listed here?

    Thanks,
    Nevena

