
Minirocket giving different test accuracies with every model on the same test data #745

Open
hailthedawn opened this issue Apr 20, 2023 · 3 comments
Labels
under review Waiting for clarification, confirmation, etc

Comments

@hailthedawn

I am using the MiniRocket classifier to perform emotion detection on ~1700 utterances. I've split the data into train and test sets, with about 300-400 utterances in the test set. Each utterance corresponds to one of three output labels.
Every time I run the build-model -> get-test-accuracy cells, I get very different test accuracies (ranging from 30% to 80%). What could be the reason behind this? I know my dataset is small - is that the cause? What do you recommend I do to get more consistent results with MiniRocket?

Note: What I am actually providing as input to the classifier is CNN embeddings corresponding to each utterance.
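One way to quantify this kind of run-to-run variation is to repeat the train/score cycle over several random splits and report the mean and standard deviation of the accuracy. The sketch below is purely illustrative: it uses synthetic 3-class "embeddings" and a toy nearest-centroid classifier (the hypothetical helper `nearest_centroid_fit_score`) as a stand-in for MiniRocket, since the point is the evaluation loop, not the model. With ~60 test utterances per split, a single run's accuracy is a noisy estimate.

```python
import numpy as np

def nearest_centroid_fit_score(X_tr, y_tr, X_te, y_te):
    """Toy stand-in for a classifier's fit + score (hypothetical helper)."""
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    # Squared distance from each test sample to each class centroid
    dists = ((X_te[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    preds = classes[dists.argmin(axis=1)]
    return (preds == y_te).mean()

rng = np.random.default_rng(0)
# Synthetic 3-class "embeddings" standing in for per-utterance CNN features
X = np.concatenate([rng.normal(loc=c, scale=2.0, size=(100, 16)) for c in range(3)])
y = np.repeat(np.arange(3), 100)

accs = []
for seed in range(10):
    idx = np.random.default_rng(seed).permutation(len(X))  # fresh random split per run
    tr, te = idx[:240], idx[240:]
    accs.append(nearest_centroid_fit_score(X[tr], y[tr], X[te], y[te]))

print(f'accuracy: {np.mean(accs):.3f} +/- {np.std(accs):.3f} over {len(accs)} runs')
```

If the standard deviation across splits is large, the variation is mostly a small-test-set effect rather than something specific to the classifier; stratifying the split by label and fixing the random seed would also make individual runs reproducible.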

@hailthedawn
Author

@oguiza Sorry for the tag - I noticed you have answered MiniRocket questions in the past, so I thought I would ask whether you have any recommendations regarding this?

@oguiza oguiza added the under review Waiting for clarification, confirmation, etc label May 4, 2023
@oguiza
Contributor

oguiza commented May 4, 2023

Hi @hailthedawn,
That's strange. I haven't seen that type of variation in the score.
Are you using something similar to this?

from tsai.basics import *  # assumed to provide get_UCR_data and timer
from tsai.models.MINIROCKET import MiniRocketClassifier

# Univariate classification with an sklearn-type API
dsid = 'OliveOil'
X_train, y_train, X_valid, y_valid = get_UCR_data(dsid)   # downloads the UCR dataset

# Computes MiniRocket features using the original (non-PyTorch) MiniRocket code,
# then sends them to sklearn's RidgeClassifier (a linear classifier).
model = MiniRocketClassifier()
timer.start(False)
model.fit(X_train, y_train)
t = timer.stop()
print(f'valid accuracy    : {model.score(X_valid, y_valid):.3%} time: {t}')

@cedced19

Hi, this is strange - I don't see the same behavior as you. I get good results until I try to classify the PenDigits dataset (from the UCR archive), where it always gives me the same accuracy: 0.106 ...
I think I'm missing something.
