CREATE FORECASTING MODEL #6861
Comments
Okay, this is actually expected behavior. What passes as the correct behavior in the video above has to do with the lightwood handler inside

What is happening

All time series predictions in lightwood are handled as "bulk" predictions. You just take a bunch of measurements and transform some of the columns into arrays, depending on the problem definition. At the end of this transformation, there will always be "cold-start" rows where the previous number of entries is not enough to fill the entire
Let's say the above timestamp is
And the mixers will be trained to predict the value in

Notice how we can't add the actual value into the priors array, because otherwise we end up with no supervision signal. It needs to be like this. This means that when bulk training, and by extension when bulk predicting, the prediction always starts at the latest available timestamp for each row.

For the simplest case of

If we now consider

Solution

As seen in the previous section, there is no way we can change the alignment in lightwood, because we need it to be like this for the supervision signal to exist and flow. What we can do in the lightwood handler, however, is activate row timestamp inference for all time series predictions, including
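The alignment described above can be illustrated with a toy sketch. This is not lightwood's actual transformation code: the function name `make_windows`, the `window` parameter name, and the use of `None` as padding are assumptions made for illustration only. It shows that each row's priors hold only values strictly before that row, that the first rows are "cold-start" rows with incomplete priors, and that the last row's target is still an observed value, so bulk prediction alone produces no rows beyond the data.

```python
def make_windows(values, window, horizon=1):
    """Toy windowing: one output row per input row.

    Hypothetical helper, not lightwood's implementation.
    priors  -> up to `window` values strictly BEFORE row i (left-padded with None)
    targets -> `horizon` values starting AT row i (right-padded with None)
    """
    rows = []
    for i in range(len(values)):
        priors = values[max(0, i - window):i]
        # Cold-start rows do not have enough history to fill the window.
        priors = [None] * (window - len(priors)) + priors
        targets = values[i:i + horizon]
        targets = targets + [None] * (horizon - len(targets))
        rows.append({"index": i, "priors": priors, "targets": targets})
    return rows


if __name__ == "__main__":
    for row in make_windows([10, 11, 12, 13, 14], window=3):
        print(row)
```

Because the value at row `i` is the supervision target, it cannot also appear in that row's priors. The forecast for the final row is therefore aligned with the final observed timestamp rather than with a future one.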
Part of the solution will actually be Lightwood-side. Fixed in mindsdb/lightwood#1075, the idea is to enable forced out-of-sample row inference even when the offset is not manually set. This will be made optional as breaking the Upstream, the
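As a rough sketch of what out-of-sample row inference could look like, the snippet below infers the sampling step from the observed timestamps and extends the series by a given horizon. It is an illustration under stated assumptions, not the code from mindsdb/lightwood#1075: the function name `infer_future_timestamps` and the choice of the most frequent delta as the step are both hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def infer_future_timestamps(timestamps, horizon):
    """Return `horizon` timestamps that follow the last observed one.

    Hypothetical helper: the step is assumed to be the most common
    difference between consecutive observed timestamps.
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps to infer a step")
    ordered = sorted(timestamps)
    deltas = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    step = Counter(deltas).most_common(1)[0][0]
    return [ordered[-1] + step * (k + 1) for k in range(horizon)]


if __name__ == "__main__":
    start = datetime(2022, 8, 1, 7, 0, 0)
    observed = [start + timedelta(hours=h) for h in range(4)]
    for ts in infer_future_timestamps(observed, horizon=3):
        print(ts.isoformat())
```

Rows built this way would carry timestamps past the end of the input data, which is the kind of output the original report expected at the end of a bulk prediction.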
Aiming for EOD Tuesday. EDIT: Current situation (as opposed to OP) is that
Finalise discussion (https://docs.google.com/spreadsheets/d/1xGsyTcfojNpsZEJ6N0GxzFrS-IwsdoREA_6xFHlw2_w/edit#gid=398055419) and make changes
Do in Q1B
@tomhuds suggest moving this issue into the mindsdb repo, as all lightwood-side changes have been completed.
Did testing - see 'proposed' sheet, rows 62 and below:
Note: JOIN should be optional, given the data is obvious
@tomhuds feels like we should create separate issues to track the recently discussed plan? This one is a bit overloaded.
Priority TBD given exercise with ML team this week.
Skipping for now - focus on anomaly detection v2
Is there an existing issue for this?
Current Behavior
When making bulk time series predictions (e.g. the query below), I would expect there to be HORIZON number of rows at the end.
Example of bulk ts predictions query:
SELECT m.received_at as received_at_model,
t.received_at as recieved_at_input,
t.temp as temp_true,
m.temp as temp_pred,
m.temp_explain as temp_pred_explain
FROM mindsdb.model_v5 as m
JOIN files.dataset as t
WHERE t.received_at > '2022-08-01 07:00:00';
Expected Behavior
HORIZON rows at the end
https://www.loom.com/share/e2a37269852f4f0082e187c57e893b33
Steps To Reproduce
Lightwood staging, Mindsdb staging https://docs.google.com/document/d/1_duUhNR_hEta0sZrQyo7A8WlQj1HjYyI5aWiPpxx9cw/edit?usp=sharing
Anything else?
No response