@pommedeterresautee
Contributor

@pommedeterresautee pommedeterresautee commented Sep 23, 2019

Optimize the way padding is done by reducing the number of dictionary calls.
Remove the conversion to UTF-8, and handle old models saved in the old format.
In the future, the conversion may be removed safely.
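
To illustrate the kind of change described above, here is a minimal sketch, not the actual flair code: the index of the padding symbol is looked up once and reused, instead of querying the dictionary for every padded position. The names `encode_and_pad`, `pad_batch`, `char_to_index`, `pad_char`, and the `<unk>` fallback are hypothetical and chosen only for this example.

```python
from typing import Dict, List


def pad_batch(sequences: List[List[int]], pad_index: int) -> List[List[int]]:
    """Right-pad every sequence to the length of the longest one."""
    longest = max((len(seq) for seq in sequences), default=0)
    return [seq + [pad_index] * (longest - len(seq)) for seq in sequences]


def encode_and_pad(
    words: List[str],
    char_to_index: Dict[str, int],  # hypothetical plain str -> int mapping
    pad_char: str = " ",            # hypothetical padding symbol
) -> List[List[int]]:
    # One dictionary lookup for the padding symbol, hoisted out of the loops.
    pad_index = char_to_index[pad_char]
    # Assumed fallback index for characters missing from the dictionary.
    unk_index = char_to_index.get("<unk>", 0)
    # Characters are used as str keys directly, with no per-letter UTF-8 encoding.
    encoded = [[char_to_index.get(ch, unk_index) for ch in word] for word in words]
    return pad_batch(encoded, pad_index)
```

With a mapping such as `{"<unk>": 0, " ": 1, "a": 2, "b": 3}`, calling `encode_and_pad(["ab", "a"], mapping)` pads the shorter word with the index of the padding symbol.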

close #1133
@alanakbik in the end I did both: a change in dictionary deserialization and a change inside the class itself.

CoNLL 2003 from 24s to 23s (-4%)
French dataset from 12s to 11s (-8%)

@pommedeterresautee
Contributor Author

There were strange issues because of the dictionary conversion, so in the end I made a smaller version of the optimization, keeping things as they are :-( Maybe something to investigate in the future.

@alanakbik
Collaborator

Ah ok I was just measuring speed differences and finding they are roughly the same. Is this because of the last changes?

@pommedeterresautee
Contributor Author

Here I still get 23s and 11s for the CoNLL and French datasets. Maybe it's a change of less than 1 second, but with rounding it appears bigger?

@alanakbik
Collaborator

Ok great - looks good! Thanks for all your help!

@alanakbik
Collaborator

👍

1 similar comment
@yosipk
Collaborator

yosipk commented Sep 23, 2019

👍

@alanakbik alanakbik merged commit 799c8dc into flairNLP:master Sep 23, 2019
alanakbik pushed a commit that referenced this pull request Oct 22, 2019

Development

Successfully merging this pull request may close these issues.

Why each letter is encoded in UTF-8?

3 participants