Commit 937535d
Allow dictionaries to overwrite entries with #fairseq:overwrite comment (#1073)
Summary:
[This commit](dd1298e) made it so that duplicate entries in a dictionary are ignored. Unfortunately the Camembert model depends on overwriting `<unk>`, `<s>` and `</s>`.
The proposed solution here is to allow the dictionary to have entries like:
```
<unk> 999 #fairseq:overwrite
<s> 999 #fairseq:overwrite
</s> 999 #fairseq:overwrite
, 999
▁de 999
. 999
(...)
```
Entries flagged this way keep the old overwriting behavior, so we can release a new `camembert.v0.tar.gz` with a dictionary like the one above and it will load correctly.
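To make the proposed format concrete, here is a minimal sketch of how a loader might handle the optional flag. It is not the code from this commit. The helper names `parse_entry` and `load_entries`, the default special tokens and their order, and the choice to implement "overwrite" by appending a new slot and re-pointing the token's index are all assumptions made for illustration. Unflagged duplicates are skipped, following the summary's description that they are ignored.

```python
OVERWRITE_FLAG = "#fairseq:overwrite"


def parse_entry(line):
    """Parse '<token> <count> [#fairseq:overwrite]' into (token, count, overwrite)."""
    rest, field = line.rstrip().rsplit(" ", 1)
    overwrite = field == OVERWRITE_FLAG
    if overwrite:
        # The flag was the last field, so the count is the field before it.
        rest, field = rest.rsplit(" ", 1)
    return rest, int(field), overwrite


def load_entries(lines, specials=("<s>", "<pad>", "</s>", "<unk>")):
    """Build (symbols, counts, indices) from dictionary lines (hypothetical helper)."""
    symbols, counts, indices = [], [], {}

    def append(token, count):
        # Assumption: the newest slot wins, so the index is re-pointed here.
        indices[token] = len(symbols)
        symbols.append(token)
        counts.append(count)

    # Assumption: the dictionary starts with these special tokens pre-registered.
    for token in specials:
        append(token, 1)

    for line in lines:
        token, count, overwrite = parse_entry(line)
        if token in indices and not overwrite:
            continue  # unflagged duplicate: ignored, per the summary above
        append(token, count)
    return symbols, counts, indices


lines = [
    "<unk> 999 #fairseq:overwrite",
    "<s> 999 #fairseq:overwrite",
    "</s> 999 #fairseq:overwrite",
    ", 999",
    "▁de 999",
    ". 999",
    "<s> 5",  # unflagged duplicate, skipped
]
symbols, counts, indices = load_entries(lines)
```

In this sketch the three flagged entries replace the pre-registered special tokens' indices, while the final unflagged `<s> 5` line leaves the dictionary unchanged.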
Pull Request resolved: fairinternal/fairseq-py#1073
Reviewed By: kahne
Differential Revision: D20284569
Pulled By: myleott
fbshipit-source-id: bf78fbff13c94bf8a6485cbdda62305ddc30c056
1 parent 3dd221c commit 937535d
2 files changed, +70 -8 lines changed