Friday, March 23, 2012

Question about Neural Networks

I created a test table named "Nset" with the columns:
id (int), n1 (float), n2 (float), n3 (float) and c1 (varchar).
Then I filled the table with the following data:
id  n1   n2   n3   c1
1   0.1  0.1  0.6  one
2   0.2  0.1  0.5  one
3   0.7  0.5  0.1  two
4   0.4  0.9  0.3  two
5   0.5  0.1  0.5  three

I then created a neural network model with the default settings. The "id" field is the key; n1, n2 and n3 are inputs; c1 is the predictable column.

Then I tried a prediction query like this:

SELECT
    PREDICT([Nset].[c1])
FROM
    [Nset]
NATURAL PREDICTION JOIN
    (SELECT 0.5 AS [n1], 0.1 AS [n2], 0.5 AS [n3]) AS t

The result is "three", which is correct. Some other tests also gave correct results.

But when I replaced the values in column c1 with numbers (one = 1, two = 2, three = 3) and changed its type to int, the prediction query stopped working correctly.

The previous query now returns 4.

Other tests showed that the returned value is too large by one.

Is this correct?

Thanks.

Hello

This can be explained. The algorithm is working correctly; the two models are simply different.

Here is what happens:

First case: "one", "two" and "three" are strings, so the mining structure/model creation wizard decides that they should be treated as Discrete. The network is trained to fit the individual states of a discrete attribute (c1). The state "three" is described by only one training point (the last row), so the weights of the network are optimized (actually, over-trained) to predict "three" for the combination of inputs that appears in the last row. This is why the prediction result is "three".

Second case: this time 1, 2 and 3 are numbers. The wizard decides that, as numeric values, they should be treated as Continuous. (The wizard only makes a suggestion; you can change it on one of the wizard pages.) Because the target is now a continuous variable, the network is not optimized independently for the distinct state 3. Instead, it is optimized for the whole surface defined by the training points, so the prediction is a numeric estimate that need not coincide with any of the training values.

You can fix it immediately by changing, in the wizard, the content type for the target column, c1, from Continuous to Discrete.
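If you prefer DMX to the wizard, the same choice can be written directly in the model definition. This is only a sketch: the model name [NsetDiscrete] is made up for illustration, and the column types are assumed to match the table described in the question. The only part that matters is the DISCRETE flag on c1.

```sql
// Hypothetical model name; columns assumed to match the Nset table above.
CREATE MINING MODEL [NsetDiscrete]
(
    [id] LONG KEY,
    [n1] DOUBLE CONTINUOUS,
    [n2] DOUBLE CONTINUOUS,
    [n3] DOUBLE CONTINUOUS,
    // DISCRETE makes the network learn the separate states 1, 2 and 3
    // instead of fitting a numeric surface through them.
    [c1] LONG DISCRETE PREDICT
)
USING Microsoft_Neural_Network
```

With CONTINUOUS in place of DISCRETE on c1 you would get the second behavior described above.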

The differences between the results should shrink as more training cases are presented to the network.

Hope this clarifies the issue.

|||Thanks! Now working properly!|||

And what is the best way to work with a variable number of neurons?

The setup is similar to the previous case, but the table has the following form:

case_id  input_name  input_weight  output_val
1        n1          0.1           1
1        n2          0.1           1
1        n3          0.6           1
2        n1          0.2           1
2        n2          0.1           1
2        n3          0.5           1
3        n1          0.7           2
3        n2          0.5           2
3        n3          0.1           2
4        n1          0.4           2
4        n2          0.9           2
4        n3          0.3           2
5        n1          0.5           3
5        n2          0.1           3
5        n3          0.5           3

I can't use SQL Server's PIVOT, because I don't know the number of neurons in advance.

I suppose I need to build a cube first and then build the neural network model on top of it.

Is that correct? Is there any other way?

|||

You could model it like this:

case_id  n1   n2   n3   output
1        0.1  0.1  0.6  1
2        ...

Alternatively, you could use the nested table feature and create a model like the one below. This works even if you do not know the number of inputs before modeling.

(
Case_ID long key,
Output long continuous predict,
inputs Table
(
neuron text key,
weight double continuous
)
)

Assuming that you have a data source view containing the table above, you can train the mining structure/model by joining the table to itself (using Case_ID as the key of the case table and then as the foreign key of the nested table).
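If you train from DMX instead of the designer, the self-join can be expressed with SHAPE ... APPEND ... RELATE. The statement below is a rough sketch, not a tested script. It assumes the model above was created under the made-up name [NsetNested], and that a data source called [NsetSource] exposes the long table as NsetLong; both names are hypothetical.

```sql
// [NsetNested], [NsetSource] and NsetLong are hypothetical names.
INSERT INTO [NsetNested]
(
    [Case_ID], [Output],
    // SKIP ignores the foreign key column of the nested rowset.
    [inputs] (SKIP, [neuron], [weight])
)
SHAPE
{
    // Case rowset: one row per case_id.
    OPENQUERY([NsetSource],
        'SELECT DISTINCT case_id, output_val FROM NsetLong ORDER BY case_id')
}
APPEND
(
    {
        // Nested rowset: one row per (case_id, input_name) pair.
        OPENQUERY([NsetSource],
            'SELECT case_id, input_name, input_weight FROM NsetLong ORDER BY case_id')
    }
    RELATE [case_id] TO [case_id]
) AS [inputs]
```

Both queries are ordered by case_id because SHAPE expects the case rowset and the nested rowset to be sorted on the related column.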

In the wizard, you can specify the same table as both Case and Nested as long as you define a one-to-many reflexive relationship from case_id to case_id.
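Once the model is trained, a singleton prediction passes the inputs as a nested rowset rather than as separate columns. The query below is a sketch under the same assumption as before: [NsetNested] is a hypothetical name for the nested-table model. The input values are the ones from the earlier test query.

```sql
// [NsetNested] is a hypothetical model name.
SELECT
    Predict([NsetNested].[Output])
FROM
    [NsetNested]
NATURAL PREDICTION JOIN
(
    SELECT
    (
        // One row per input; add or remove rows as the number of inputs changes.
        SELECT 'n1' AS [neuron], 0.5 AS [weight] UNION
        SELECT 'n2' AS [neuron], 0.1 AS [weight] UNION
        SELECT 'n3' AS [neuron], 0.5 AS [weight]
    ) AS [inputs]
) AS t
```

Because each input is a row, the same query shape works whether a case has three inputs or thirty.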

|||

Thank you for the help again!

Everything works except for the Neural Network Viewer.

The NN Viewer shows this message:

"Execution of the managed stored procedure GetAttributeScores failed with the following error: Exception has been thrown by the target of an invocation.Input string was not in a correct format..."

But that is not important. The prediction query works correctly.

Thanks once again!

|||

Glad to hear it works

> All works except for a Neural Network Viewer

Some viewer problems were fixed in SP2 of SQL Server 2005, which should be publicly available soon.
