compromise is an open-source natural language processing library written in JavaScript
spencermountain released this
- [change] improved support for fractions in numbers-plugin #793
- [change] remove zero-width characters in normalized output #759
- [change] improved Person tagging with particles #794
- [change] improved i18n Person names
- [change] tagger+tokenization fixes
- [change] remove empty results from .out('array') #795
- [change] .tokenize() runs any postProcess() scripts from plugins
- [change] improved support for lowercase acronyms
- [change] support years like '97
- [change] change tokenizer for '20-aug'
- [change] update deps of all plugins
- [fix] NumberRange tagging issue #795
- [fix] improved support for ordinal number ranges
- [fix] improved regex support in match-syntax
- [fix] improved support for softmatch syntax #797
- [fix] better handling of {0,n} match syntax
- [new] new plugin strict-match
- [new] set NounPhrase, VerbPhrase tags in nlp-sentences plugin
- [new] .phrases() method in nlp-sentences plugin
- [new] support .append(doc) and .prepend(doc)
- [new] values.normalize() method

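To make the "remove zero-width characters in normalized output" entry (#759) concrete, here is a minimal sketch of what such a cleanup can look like. The helper name stripZeroWidth and the exact set of code points are assumptions made for illustration; this is not compromise's internal code.

```javascript
// Hypothetical helper (not part of compromise): removes common zero-width
// characters. The code-point list is an assumption for this sketch:
// U+200B zero-width space, U+200C zero-width non-joiner,
// U+200D zero-width joiner, U+2060 word joiner, U+FEFF byte-order mark.
const ZERO_WIDTH = /[\u200B\u200C\u200D\u2060\uFEFF]/g

function stripZeroWidth(text) {
  return text.replace(ZERO_WIDTH, '')
}

// 'natural' and 'language' glued together by an invisible U+200B,
// followed by a stray U+FEFF.
const raw = 'natural\u200Blanguage\uFEFF'
const cleaned = stripZeroWidth(raw)

console.log(raw.length) // 17
console.log(cleaned.length) // 15
console.log(cleaned === 'naturallanguage') // true
```

One design caveat: U+200D also joins emoji sequences, so a real normalizer may want to treat it more carefully than this sketch does.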
- [change] many misc tagging fixes
- 'if' is now a #Preposition
- possessive pronouns are #Pronoun and #Possessive
- more phrasal verbs
- make #Participle tag #PastTense
- favor #PastTense over #Participle interpretation in tagger
- [change] @hasHyphen returns false for sentence dashes
- a lot more testing

add new parseMatch() method for pre-computed match statements, and faster lookups

- support unicode spaces for #759
- major improvements to compromise-plugin-dates (1.0.0)
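As a rough illustration of what "unicode spaces" means in the entry above, the sketch below splits text on any Unicode whitespace, not only the ASCII space. It relies on the fact that JavaScript's \s class already covers characters such as U+00A0 (no-break space), U+2003 (em space) and U+3000 (ideographic space). The function name splitOnUnicodeSpaces is hypothetical, and this is only one plausible reading of the change, not the library's tokenizer.

```javascript
// Hypothetical helper (not part of compromise): splits text into words on
// any run of Unicode whitespace. In JavaScript, \s matches U+00A0,
// U+2000-U+200A, U+3000 and the other Unicode space separators,
// in addition to ASCII whitespace.
function splitOnUnicodeSpaces(text) {
  return text.split(/\s+/).filter(Boolean)
}

// Words separated by a no-break space, an em space and an ideographic space.
const words = splitOnUnicodeSpaces('one\u00A0two\u2003three\u3000four')

console.log(words) // [ 'one', 'two', 'three', 'four' ]
```

The filter(Boolean) step drops the empty strings that split() produces when the input starts or ends with whitespace.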
13.2.0
- deprecate .money() and favour overloaded method in compromise-numbers plugin
- add .percentages() and .fractions() to compromise-numbers plugin
- add .hasAfter() and .hasBefore() methods
- change handling of slashes
- add .world() method to constructor
- add more abbreviations
- fix regex backtracking #739
- tokenize build:
  - remove conjugation and inflection data
  - remove conjugation and inflection functions
- remove sourcemap from build process (too big)
- improvements to .numbers().units()
- fix for linked-list runtime error #744 with contractions

- fix verbs.json() runtime-error
- improve empty .lists() methods
- allow custom tag colors
- test new github action workflow
13.0.0
major changes to .export() and [capture] group match-syntax.
- [breaking] move .export() and .load() methods to plugin (compromise-export)
  - change .export() format - this hasn't worked properly since v12 (mis-parsed contractions), see #669
- [breaking] split compromise-output into compromise-html and compromise-hash plugins
- [breaking] .match('foo [bar]') no longer returns 'bar' (use .match('foo [bar]', 0))
- [breaking] capture groups are no longer merged. .match('[foo] [bar]') returns two groups accessible with the new .groups() function
- [breaking] change .sentences() method to return only full-sentences of matches (use .all() instead)
modifications:
- fix nlp.clone() - hasn't worked properly since v12 (@Drache93)
- fix issues with greedy capture [*] and [.+] (@Drache93) 💛
- add whitespace properties (pre+post) to default json output (suppress with .json({ whitespace: false }))
- .lookup({ key: val }) with an object now returns an object back ({val: Doc})
- add nlp constructor as a third param to .extend()
- support lexicon object param in tokenize - .tokenize('my word', { word: 'tag' })
- clean-up of scripts and tooling
- improved typescript types
- add support for some french contractions like j'aime -> je aime
- allow null results in .map() function
new things:
- add new named-match syntax, with .groups() method (@Drache93)
- add nlp.fromJSON() method
- add a new compromise-tokenize.js build, without the tagger, or data included.
12.3.0
- prefer @titleCase instead of #TitleCase tag
- update dependencies
- fix case-sensitive paths
- fix greedy-start match condition regression #651
- fix single period sentence runtime error
- fix potentially-unsafe regexes
- improved tagging for '-ed' verbs (#616)
- improve support for auxiliary-pastTense ('was lifted') verb-phrases
- more robust number-tagging regexes
- setup typescript types for plugins #661 (thanks @Drache93!)
- verb conjugation and tagger bugfixes
- disambiguate acronyms & yelling
12.2.1
- fix 'aint' contraction
- make Doc.world writable
- update deps
- more tests
- fix shared period with acronym at end of sentence
- fix some mis-classification of contraction
- fix over-active emoji regex
- tag 'cookin', 'hootin' as Gerund
- support unicode single-quote symbols in contractions
12.2.0
- improved splitting in .nouns()
- add .nouns().adjectives() method
- add concat param to .pre() and .post()
- allow ellipses at start of term "....so" in @hasEllipses
- fix matches with optional-end foo?$ match syntax
- add typescript types for subsets
12.1.0
- add 'sideEffect:false' flag to build
- considerable speedup (20%) in tagger
- ensure trimming of whitespace for root/clean/reduced text formats
- fix client-side logging
- more flexible params to replace() and replaceWith()
compromise is a modest library that does natural-language processing in javascript.
it was built to make searching and transforming human-text easy and playful.
I'm very proud to release compromise v12, our strongest, fastest, and smallest release yet.

couple bug fixes
Watchers: 182 |
Stars: 9730 |
Forks: 618 |
Created: 2011-07-05 17:04:38 |
Last commit: 5 days ago |
License: MIT |
Category: Miscellaneous / JavaScript development |
Indexed: 2019-03-08 21:50:17 |
! syntax
compromise-dates@1.4.0, compromise-numbers@1.2.0