@bdewilde
Collaborator

Description

  • Use spaCy's new DocBin class for saving/loading Corpus data
  • Allow parallel processing when adding texts/records to a Corpus
  • Bump min spaCy version accordingly, to 2.2.0
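
For readers unfamiliar with `DocBin`, here is a minimal sketch of the save/load round trip it enables. It is not the code from this PR and does not show how `Corpus` uses it; the blank pipeline, the sample texts, and the variable names are assumptions made for illustration.

```python
import spacy
from spacy.tokens import DocBin

# Blank English pipeline (tokenizer only), so no model download is needed.
# Assumption: a real Corpus would use a full, trained pipeline instead.
nlp = spacy.blank("en")
texts = ["This is the first text.", "Here is a second, shorter one."]

# Collect processed docs into a DocBin; store_user_data=True also keeps
# each doc's user_data dict alongside the token attributes.
doc_bin = DocBin(store_user_data=True)
for doc in nlp.pipe(texts):
    doc_bin.add(doc)

# Serialize everything to a single bytes object...
data = doc_bin.to_bytes()

# ...then restore the docs into a fresh DocBin, reusing the same vocab.
restored = list(DocBin(store_user_data=True).from_bytes(data).get_docs(nlp.vocab))
restored_texts = [doc.text for doc in restored]
```

The bytes could just as well be written to and read from a file on disk; the round trip through `to_bytes()` and `from_bytes()` is the same either way.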

Motivation and Context

Just availing myself of new spaCy functionality, and replacing some hacks in the process.

How Has This Been Tested?

All tests pass, including a couple new ones.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation, and I have updated it accordingly.

@bdewilde bdewilde merged commit 662b160 into develop Dec 30, 2019
@bdewilde bdewilde deleted the docbin-serialization branch December 30, 2019 19:39