
Problems storing both numeric and string data within a single channel #8

Open
chrisbartley opened this issue Jan 28, 2016 · 0 comments


Problem Description

I'm not sure whether this is even supposed to work, but storing both numeric and string data within a single channel results in three issues:

  1. Doing an export on such a channel will only return the numeric data. However, doing a tile request returns both.
  2. Inserting a sample with string data for a timestamp which has no numeric data causes a numeric data value to be inserted as well, apparently from the neighboring timestamp.
  3. Doing a gettile for a data sample at a time that has a string value but no numeric value returns 0 for the value. One would expect null instead.

Before illustrating the steps to reproduce, assume we have the following two JSON data files to insert into the datastore:

data1.json

{
   "channel_names" : ["foo"],
   "data" : [
      [1450000001, 10],
      [1450000002, 20],
      [1450000004, 40],
      [1450000005, 50],
      [1450000006, 60],
      [1450000007, 70],
      [1450000008, 80],
      [1450000009, 90]
   ]
}

data2.json

{
   "channel_names" : ["foo"],
   "data" : [
      [1450000003, "thirty"],
      [1450000007, "seventy"]
   ]
}
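If it helps with reproducing this, the two files can be generated with a short script. This is only a convenience sketch: the sample values are copied from the listings above, and the helper name `write_upload` is made up for illustration.

```python
import json

# Samples copied from the two listings above: [epoch_seconds, value].
numeric_samples = [
    [1450000001, 10], [1450000002, 20], [1450000004, 40], [1450000005, 50],
    [1450000006, 60], [1450000007, 70], [1450000008, 80], [1450000009, 90],
]
string_samples = [[1450000003, "thirty"], [1450000007, "seventy"]]


def write_upload(path, samples, channel="foo"):
    """Write one file in the channel_names/data layout shown above (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump({"channel_names": [channel], "data": samples}, f, indent=3)


write_upload("data1.json", numeric_samples)
write_upload("data2.json", string_samples)
```

Note that time 1450000003 is deliberately absent from the numeric samples, while 1450000007 appears in both files.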

Steps To Reproduce

Begin by inserting the first data file:

./bin/import --format json ./data-test 100 cpb_device ./data1.json

It should succeed with the following response:

{
   "channel_specs" : {
      "foo" : {
         "channel_bounds" : {
            "max_time" : 1450000009,
            "max_value" : 90,
            "min_time" : 1450000001,
            "min_value" : 10
         },
         "imported_bounds" : {
            "max_time" : 1450000009,
            "max_value" : 90,
            "min_time" : 1450000001,
            "min_value" : 10
         }
      }
   },
   "failed_records" : 0,
   "max_time" : 1450000009,
   "min_time" : 1450000001,
   "successful_records" : 1
}

Now do an export to verify:

./bin/export --csv ./data-test 100.cpb_device.foo

It should print the following:

EpochTime,100.cpb_device.foo
1450000001,10
1450000002,20
1450000004,40
1450000005,50
1450000006,60
1450000007,70
1450000008,80
1450000009,90

Also verify by requesting a tile:

./bin/gettile ./data-test 100 cpb_device.foo -5 90625000

You should get the following:

{
   "data" : [
      [1450000001, 10, 0, 1],
      [1450000002, 20, 0, 1],
      [1450000004, 40, 0, 1],
      [1450000005, 50, 0, 1],
      [1450000006, 60, 0, 1],
      [1450000007, 70, 0, 1],
      [1450000008, 80, 0, 1],
      [1450000009, 90, 0, 1],
      [1450000012.5, -1e308, 0, 0]
   ],
   "fields" : ["time", "mean", "stddev", "count"],
   "level" : -5,
   "offset" : 90625000
}
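As an aside on the level and offset arguments used in the gettile call: the sketch below assumes that a tile holds 512 bins, each 2^level seconds wide, and that the offset counts whole tiles since the epoch. That layout is my assumption and is not stated anywhere in this issue, but it is consistent with the numbers used here, since 90625000 * 16 = 1450000000.

```python
import math

BINS_PER_TILE = 512  # assumption, not confirmed by this issue


def tile_duration(level):
    """Seconds covered by one tile, assuming 2**level seconds per bin."""
    return BINS_PER_TILE * 2.0 ** level


def tile_offset(timestamp, level):
    """Index of the tile containing the timestamp, under the same assumption."""
    return math.floor(timestamp / tile_duration(level))


def tile_bounds(level, offset):
    """Start and end time (in seconds) of the given tile."""
    duration = tile_duration(level)
    return offset * duration, (offset + 1) * duration


print(tile_offset(1450000001, -5))   # offset of the tile holding the first sample
print(tile_bounds(-5, 90625000))     # time range of the tile requested above
```

Under that assumption, every sample in data1.json falls inside the single tile at level -5, offset 90625000.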

So far, so good. Now, insert the second data file. Note that this data file contains two string values, one at time 1450000003 and another at time 1450000007. Looking at data1.json, we see that there's no existing numeric data value for this channel at time 1450000003, but there is one (70) for time 1450000007.

./bin/import --format json ./data-test 100 cpb_device ./data2.json

It should succeed with the following response:

{
   "channel_specs" : {
      "foo" : {
         "channel_bounds" : {
            "max_time" : 1450000009,
            "max_value" : 90,
            "min_time" : 1450000001,
            "min_value" : 10
         },
         "imported_bounds" : {
            "max_time" : 1450000007,
            "min_time" : 1450000003
         }
      }
   },
   "failed_records" : 0,
   "max_time" : 1450000007,
   "min_time" : 1450000003,
   "successful_records" : 1
}

Now do another export to verify:

./bin/export --csv ./data-test 100.cpb_device.foo

EpochTime,100.cpb_device.foo
1450000001,10
1450000002,20
1450000003,40
1450000004,40
1450000005,50
1450000006,60
1450000007,70
1450000008,80
1450000009,90

So, there are the first two problems: no string values are exported at all, and a numeric value that we never inserted (40, apparently copied from the neighboring timestamp 1450000004) is returned at time 1450000003. There's of course the question of how to report two values for a single timestamp (as with time 1450000007), but that's more of an implementation detail. Regardless, one would expect to at least see a string value for time 1450000003.
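The two export problems can be checked mechanically. The following is a sketch only: it works on the CSV text pasted from above rather than calling the export binary, and the function names are hypothetical.

```python
import csv
import io

# What was inserted, copied from data1.json and data2.json.
inserted_numeric = {
    1450000001: 10, 1450000002: 20, 1450000004: 40, 1450000005: 50,
    1450000006: 60, 1450000007: 70, 1450000008: 80, 1450000009: 90,
}
inserted_strings = {1450000003: "thirty", 1450000007: "seventy"}

# Output of the second export, copied verbatim from above.
export_csv = """EpochTime,100.cpb_device.foo
1450000001,10
1450000002,20
1450000003,40
1450000004,40
1450000005,50
1450000006,60
1450000007,70
1450000008,80
1450000009,90
"""


def is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_export(csv_text, numeric, strings):
    """Return (times with a numeric value never inserted, times whose string is missing)."""
    reader = csv.reader(io.StringIO(csv_text.strip()))
    next(reader)  # skip the header row
    exported = {int(t): v for t, v in reader}
    phantom = sorted(t for t, v in exported.items() if is_number(v) and t not in numeric)
    missing = sorted(t for t, s in strings.items() if exported.get(t) != s)
    return phantom, missing


phantom, missing = check_export(export_csv, inserted_numeric, inserted_strings)
print("numeric values never inserted:", phantom)
print("string values missing from export:", missing)
```

For the output above, this flags time 1450000003 as carrying a numeric value that was never inserted, and both 1450000003 and 1450000007 as missing their string values.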

Now do a gettile to see the difference:

./bin/gettile ./data-test 100 cpb_device.foo -5 90625000

{
   "data" : [
      [1450000001, 10, 0, 1, null],
      [1450000002, 20, 0, 1, null],
      [1450000002.5, -1e308, 0, 0, null],
      [1450000003, 0, 0, 1, "thirty"],
      [1450000003.5, -1e308, 0, 0, null],
      [1450000004, 40, 0, 1, null],
      [1450000005, 50, 0, 1, null],
      [1450000006, 60, 0, 1, null],
      [1450000007, 70, 0, 1, "seventy"],
      [1450000008, 80, 0, 1, null],
      [1450000009, 90, 0, 1, null],
      [1450000012.5, -1e308, 0, 0, null]
   ],
   "fields" : ["time", "mean", "stddev", "count", "comment"],
   "level" : -5,
   "offset" : 90625000
}

For gettile, we do get the string values, but notice that the value at time 1450000003 is no longer reported as 40 (as in the export): it's now 0, with a count of 1. One would expect a null value, just as the comment field is null wherever there is no string.
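A similar check can be sketched for the gettile response. It parses the JSON pasted from above (it does not call gettile itself) and flags rows that have a comment and a nonzero count at a time where no numeric value was ever inserted. The function name is hypothetical.

```python
import json

# Times that received a numeric value via data1.json.
numeric_times = {1450000001, 1450000002, 1450000004, 1450000005,
                 1450000006, 1450000007, 1450000008, 1450000009}

# Second gettile response, copied from above.
tile = json.loads("""
{
   "data" : [
      [1450000001, 10, 0, 1, null],
      [1450000002, 20, 0, 1, null],
      [1450000002.5, -1e308, 0, 0, null],
      [1450000003, 0, 0, 1, "thirty"],
      [1450000003.5, -1e308, 0, 0, null],
      [1450000004, 40, 0, 1, null],
      [1450000005, 50, 0, 1, null],
      [1450000006, 60, 0, 1, null],
      [1450000007, 70, 0, 1, "seventy"],
      [1450000008, 80, 0, 1, null],
      [1450000009, 90, 0, 1, null],
      [1450000012.5, -1e308, 0, 0, null]
   ],
   "fields" : ["time", "mean", "stddev", "count", "comment"],
   "level" : -5,
   "offset" : 90625000
}
""")


def string_only_rows(tile, numeric_times):
    """Rows with a comment and count > 0 at a time that has no inserted numeric value."""
    flagged = []
    for row in tile["data"]:
        record = dict(zip(tile["fields"], row))
        if (record["comment"] is not None
                and record["count"] > 0
                and record["time"] not in numeric_times):
            flagged.append(record)
    return flagged


for record in string_only_rows(tile, numeric_times):
    print(record["time"], "reports mean", record["mean"], "for", repr(record["comment"]))
```

Only the row at 1450000003 is flagged; the row at 1450000007 has a legitimately inserted numeric value of 70 alongside its string.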
