The task schema for the knn learner shows that it requires training instances to be presented as arrays of numbers. If we wish to use this learner to classify irises, we need to adapt the structure of the iris relation’s instance values to suit it. The first step in constructing a learning task is therefore to reorganise the information provided by the default attribute so that the iris dimensions are presented as an array of numbers. This can be done by creating a new attribute with a create request, sent as a POST to the iris relation:

POST http://example.org/data/iris
{
    "psiType":     "attribute-definition",
    "description": "A feature vector representation of iris dimensions",
    "attribute": [
        "http://example.org/data/iris/flower/sepal/length",
        "http://example.org/data/iris/flower/sepal/width",
        "http://example.org/data/iris/flower/petal/length",
        "http://example.org/data/iris/flower/petal/width"
    ]
}
201 Created
Location: http://example.org/data/iris/array1
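
The create request above could be assembled programmatically along the following lines. This is a minimal sketch using only the Python standard library: the helper name build_attribute_request is hypothetical, the Content-Type header is an assumption about what the service accepts, and the request is built but deliberately not sent.

```python
import json
from urllib.request import Request

IRIS_RELATION = "http://example.org/data/iris"


def build_attribute_request(relation_uri, description, attribute_uris):
    """Build (but do not send) the POST that defines an array-valued attribute.

    Hypothetical helper: the body mirrors the create request shown above.
    """
    body = {
        "psiType": "attribute-definition",
        "description": description,
        # A JSON array here asks for an array-valued attribute whose
        # elements follow the order of the listed attributes.
        "attribute": list(attribute_uris),
    }
    return Request(
        relation_uri,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the service accepts JSON request bodies.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_attribute_request(
    IRIS_RELATION,
    "A feature vector representation of iris dimensions",
    [
        IRIS_RELATION + "/flower/sepal/length",
        IRIS_RELATION + "/flower/sepal/width",
        IRIS_RELATION + "/flower/petal/length",
        IRIS_RELATION + "/flower/petal/width",
    ],
)
```

Passing the request to urllib.request.urlopen would send it; on success the Location header of the response would carry the URI of the new attribute.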

A GET request to the new URI reveals the schema of values emitted by the newly created attribute:

GET http://example.org/data/iris/array1
200 OK
{
    "psiType":      "attribute",
    "uri":          "http://example.org/data/iris/array1",
    "description":  "A feature vector representation of iris dimensions",
    "relation":     "http://example.org/data/iris",
    "emits":        { "$array": { "items": [ "$number", "$number", "$number", "$number" ] } },
    "subattributes": [
        "http://example.org/data/iris/flower/sepal/length",
        "http://example.org/data/iris/flower/sepal/width",
        "http://example.org/data/iris/flower/petal/length",
        "http://example.org/data/iris/flower/petal/width"
    ],
    "querySchema": { ... }
}

The above response shows that the newly created attribute emits an array of four numbers. The list of sub-attributes refers to the existing attributes used to define the new attribute in its creation request. Note that array-valued attributes present their sub-attributes in an array, while object-valued attributes (such as the original flower attribute) present them in an object, so that in each case the sub-attribute responsible for generating each part of the structured value can be identified.
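
The correspondence between array positions and sub-attributes can be illustrated with a short sketch. The functions compose_array and matches_emits are hypothetical, the measurements are illustrative sample values rather than output from the service, and the check handles only the single schema shape shown in the response above, not the schema language in general.

```python
from numbers import Number

# Copied from the "emits" and "subattributes" properties of the response above.
EMITS = {"$array": {"items": ["$number", "$number", "$number", "$number"]}}
SUBATTRIBUTES = [
    "http://example.org/data/iris/flower/sepal/length",
    "http://example.org/data/iris/flower/sepal/width",
    "http://example.org/data/iris/flower/petal/length",
    "http://example.org/data/iris/flower/petal/width",
]


def compose_array(values_by_attribute, subattributes):
    """Element i of the result is the value produced by subattributes[i]."""
    return [values_by_attribute[uri] for uri in subattributes]


def matches_emits(value, schema):
    """Check a value against a fixed-length array-of-numbers schema only."""
    items = schema["$array"]["items"]
    if not isinstance(value, list) or len(value) != len(items):
        return False
    return all(
        item == "$number" and isinstance(v, Number) and not isinstance(v, bool)
        for item, v in zip(items, value)
    )


# Illustrative measurements for one iris instance (assumed sample values).
sample = {
    SUBATTRIBUTES[0]: 5.1,
    SUBATTRIBUTES[1]: 3.5,
    SUBATTRIBUTES[2]: 1.4,
    SUBATTRIBUTES[3]: 0.2,
}
vector = compose_array(sample, SUBATTRIBUTES)
```

Because the sub-attributes are listed in an array, the third element of vector can be traced back to the petal length attribute simply by its position.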

Training a predictor

Since the newly created attribute emits arrays of numbers, it satisfies the source part of the task schema in the knn learner’s description. The following process request, which omits the relation property since all instances are to be used, can be sent to the knn learner via a POST:

POST http://example.org/learner/knn
{
    "psiType": "task",
    "task": {
        "k": 3,
        "resources": {
            "source":  "$http://example.org/data/iris/array1",
            "target":  "$http://example.org/data/iris/flower/species"
        }
    }
}
201 Created
Location: http://example.org/infer/knn_iris_20130802180742606

K-nearest neighbour algorithms are “lazy” algorithms in the sense that training consists merely of memorising the instances. The training time for this task is therefore very short, so the learner is able to return the HTTP status code 201 (Created) and the URI of the predictor immediately, rather than the HTTP status code 202 (Accepted).
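
A client might build the task and distinguish the two status codes roughly as follows. This is a sketch under stated assumptions: build_knn_task and interpret_training_response are hypothetical helpers, the "$" prefix on resource URIs is simply copied from the request above, and what accompanies a 202 response is not covered here, so its Location header is treated as optional.

```python
def build_knn_task(source_uri, target_uri, k):
    """Return the JSON-serialisable body of the process request shown above."""
    return {
        "psiType": "task",
        "task": {
            "k": k,
            "resources": {
                # The "$" prefix mirrors the request shown above.
                "source": "$" + source_uri,
                "target": "$" + target_uri,
            },
        },
    }


def interpret_training_response(status, headers):
    """Map the learner's HTTP status to a (state, uri) pair.

    201 Created: the predictor already exists at the Location URI.
    202 Accepted: training has been accepted but is not finished; whether
    a Location header is supplied in that case is an assumption left open.
    """
    if status == 201:
        return ("ready", headers["Location"])
    if status == 202:
        return ("pending", headers.get("Location"))
    raise ValueError(f"unexpected status {status}")


task = build_knn_task(
    "http://example.org/data/iris/array1",
    "http://example.org/data/iris/flower/species",
    k=3,
)
state, predictor_uri = interpret_training_response(
    201, {"Location": "http://example.org/infer/knn_iris_20130802180742606"}
)
```

For the knn learner the first branch is the expected one, since memorising the instances completes at once.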

Next steps

Let’s inspect the newly created predictor and see what it can do.
