
Data Files

City University of Hong Kong

Load data into Weka Explorer Interface

How to start Weka in JupyterHub?

  • Open the Launcher (File->New Launcher).
  • Start a Desktop from the Launcher.
  • Start a Terminal from the menu on the top left.
  • Run the command weka and click the Explorer button.
  • Load data from the folder /data/.
Other methods to run Weka
  • To run Weka locally on your computer:
    1. Download and install Weka from here, and
    2. obtain the data files either
      1. from the subfolder data of your Weka installation path, or
      2. by downloading them from here.
  • For the computers in CSC teaching studios, you can start Weka as follows:
    1. Click the Work Desk shortcut on the desktop.
    2. Click the link Weka 3.8.x for CS Department.
    3. Load the dataset from
      C:\Program Files\Weka-3-8\data\.
  • For the computers in CS labs, you can start Weka as follows:
    1. Execute G:\weka\3.8\run.bat,
    2. click the Explorer button, and
    3. load the dataset from
      C:\temp\Weka-3.8\data or G:\weka\3.8\files\data.

Use Weka to do [Witten11] Exercises 17.1.1 and 17.1.2.

YOUR ANSWER HERE

YOUR ANSWER HERE

Create an ARFF file

Create an ARFF file named AND.arff in the current directory for the AND gate $Y = X_1 \cdot X_2$. Use 0 and 1 to represent False and True, respectively.
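As a reminder of the ARFF layout, here is a minimal sketch for a different gate, the NOT gate, rather than the AND gate asked for above. The relation name NOT, the attribute names X and Y, and the variable name `example` are illustrative assumptions. The text is deliberately not stored in `content`, so it does not interfere with the cell that writes AND.arff. The check at the end is a rough standard-library sanity check, not a full ARFF parser.

```python
# Hypothetical example: ARFF text for a NOT gate (not the exercise answer).
# An ARFF file has a header (@RELATION, one @ATTRIBUTE per column)
# followed by @DATA and one comma-separated row per instance.
example = """\
% Truth table of a NOT gate (lines starting with % are comments)
@RELATION NOT

@ATTRIBUTE X {0,1}
@ATTRIBUTE Y {0,1}

@DATA
0,1
1,0
"""

# Rough sanity check: drop blank lines and comments, then split header from data.
lines = [ln.strip() for ln in example.splitlines()]
lines = [ln for ln in lines if ln and not ln.startswith('%')]
data_start = next(i for i, ln in enumerate(lines) if ln.upper() == '@DATA')

# Attribute names are the second token of each @ATTRIBUTE line.
attributes = [ln.split()[1] for ln in lines[:data_start]
              if ln.upper().startswith('@ATTRIBUTE')]
# Each data row is a comma-separated list of values.
rows = [ln.split(',') for ln in lines[data_start + 1:]]

print(attributes)
print(rows)
```

Here the attributes are declared as nominal with the value set {0,1}; declaring them as numeric instead is another option, and which one suits the exercise is left to you.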

# YOUR CODE HERE
raise NotImplementedError()

# write `content` to the file AND.arff (skipped if `content` is undefined)
try:
    content
except NameError:
    print("AND.arff not generated because `content` is undefined.")
else:
    filename = 'AND.arff'
    with open(filename,'w') as f:
        f.write(content)
    print("AND.arff generated.")

Run the following test cell to check whether your file is a valid ARFF file. You may also download the ARFF file and load it into Weka to check for syntax errors.

# test
print('Content of AND.arff:')
with open(filename) as f:
    print(f.read())

from scipy.io import arff
import pandas as pd

d = arff.loadarff(filename)
df = pd.DataFrame(d[0]).astype(int)
df.head()