Can't train PatchTST on different (than X) dimensions for target #713

strakehyr · 2023-03-23T14:44:58Z

Hi @oguiza and thanks again for implementing a SOTA model.

I seem to have run into a limitation in PatchTST: I can't train it when X and y have different dimensions. With TSTPlus, by contrast, we can have several X covariates and a different number of y series (with different dimensions). I might just be using the wrong approach, as your PatchTST example uses a sliding window over the same variables for X and y.

The TSTPlus case:

X.shape, y.shape
arch_config = dict(
    n_layers=3,  
    ks = 4,
    n_heads=4,  
    d_model=16,  
    d_ff=128,  
    dropout=0.3,
)
learn = TSForecaster(X, y, splits=splits, batch_size=16, path="models", pipelines=[exp_pipe],
                     arch="TSTPlus", arch_config=arch_config, metrics=[mse, mae], cbs=ShowGraph())


n_epochs = 100
lr_max = 0.0025
lr_max = learn.lr_find().valley

((7416, 168, 4), (7416, 24, 3))
Epoch 1/1 : |█████---------------| 28.57% [92/322 00:01<00:03 1.1506]

The same setup doesn't work for PatchTST:

arch_config = dict(
    n_layers=3,
    n_heads=4,
    d_model=16,
    d_ff=128,
    attn_dropout=0.0,
    dropout=0.3,
    patch_len=1,
)
learn = TSForecaster(X, y, splits=splits, batch_size=16, path="models", pipelines=[exp_pipe],
                     arch="PatchTST", arch_config=arch_config, metrics=[mse, mae], cbs=ShowGraph())
n_epochs = 100
lr_max = 0.0025
lr_max = learn.lr_find().valley

RuntimeError: The size of tensor a (8064) must match the size of tensor b (1152) at non-singleton dimension 0
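The two sizes in this error can be reproduced from the shapes reported above. The sketch below is only an illustration under stated assumptions: that axis 1 of X and y is read as the variable axis (tsai's usual layout is samples x variables x steps), that PatchTST returns one forecast per input variable, and that the loss compares flattened predictions and targets. None of this is taken from the library source.

```python
from math import prod

batch_size = 16
x_shape = (7416, 168, 4)  # X.shape as reported in the issue
y_shape = (7416, 24, 3)   # y.shape as reported in the issue

# Assumption: axis 1 is the variable axis, the last axis of y is the horizon.
n_vars_in = x_shape[1]
n_vars_out = y_shape[1]
horizon = y_shape[2]

# Assumption: the model forecasts every input variable,
# so predictions have shape (batch, n_vars_in, horizon).
pred_shape = (batch_size, n_vars_in, horizon)
target_shape = (batch_size, n_vars_out, horizon)

pred_numel = prod(pred_shape)
target_numel = prod(target_shape)
print(pred_numel, target_numel)  # 8064 1152
```

If these assumptions hold, the mismatch comes from the number of variables (168 in, 24 out), not from the horizon. It may also be worth checking whether the arrays were meant to be laid out as (samples, steps, variables), in which case they would need transposing before being passed in.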


oguiza · 2023-03-23T16:28:54Z

Thanks for raising this @strakehyr.
You are absolutely right. This is a limitation of the current PatchTST model.
I'd like to develop a PatchTSTPlus model that can be used in scenarios like the one you described above.
This is the normal process I've followed in the library: the standard version replicates the model published in the paper/code as closely as possible, and a Plus version then adds functionality. Forecasting any number of variables is one of those scenarios. Another is classification or regression.
I'd like to work on this soon, but I need to find the time or resources. Would you be interested in creating a PR? If you are, I could give you some direction on what needs to be done.
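As a rough idea of what "forecasting any number of variables" could involve, here is a minimal PyTorch sketch that projects a per-variable forecast from c_in input variables to c_out target variables. It is not the planned PatchTSTPlus and not part of tsai: `VarProjection` and `DummyBackbone` are hypothetical names, the backbone is a stand-in for any model that returns (batch, c_in, horizon), and the shapes assume a variables-first layout with 4 covariates, 168 input steps, 3 targets and 24 forecast steps.

```python
import torch
import torch.nn as nn


class DummyBackbone(nn.Module):
    """Stand-in for a channel-independent forecaster (hypothetical)."""

    def __init__(self, seq_len: int, horizon: int):
        super().__init__()
        self.linear = nn.Linear(seq_len, horizon)

    def forward(self, x):
        # (batch, c_in, seq_len) -> (batch, c_in, horizon)
        return self.linear(x)


class VarProjection(nn.Module):
    """Hypothetical wrapper that mixes c_in forecasts into c_out targets."""

    def __init__(self, backbone: nn.Module, c_in: int, c_out: int):
        super().__init__()
        self.backbone = backbone
        self.proj = nn.Linear(c_in, c_out)

    def forward(self, x):
        out = self.backbone(x)      # (batch, c_in, horizon)
        out = out.transpose(1, 2)   # (batch, horizon, c_in)
        out = self.proj(out)        # (batch, horizon, c_out)
        return out.transpose(1, 2)  # (batch, c_out, horizon)


model = VarProjection(DummyBackbone(seq_len=168, horizon=24), c_in=4, c_out=3)
x = torch.randn(16, 4, 168)
out_shape = tuple(model(x).shape)
print(out_shape)  # (16, 3, 24)
```

A linear mix across variables is only one option; selecting a subset of the predicted variables would be another when the targets are among the inputs.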

vrodriguezf · 2023-03-24T13:22:33Z

I am interested in this too. I might have some time to work on it in about two weeks.

strakehyr · 2023-04-19T12:15:41Z

I would be happy to help, but I don't currently have the time.

oguiza added the labels "enhancement" (New feature or request) and "ideas" (New ideas to enhance tsai) on Mar 23, 2023

oguiza mentioned this issue Mar 24, 2023

multi-horizon forecasting #591

Closed

Valhir924 mentioned this issue May 21, 2023

adjust PatchTST into a classification mission. #773

Open

E-Penguin mentioned this issue Jul 4, 2023

PatchTST output/target tensor mismatch in loss function #803

Open

oguiza mentioned this issue Sep 3, 2023

Implementing PatchTST but on a different type of supervised data #825

Open

erickmiller mentioned this issue May 30, 2024

Which models can I use? TSRegressor #887

Open
