Korean NLP foundation · Verified
NSD-Korean-NLP-Foundation
Unified stack for Korean text processing, linguistic analysis, and discourse structure.
Morphological, syntactic, discourse, marker, dictionary, and structural analysis are all built in-house and run as a single system with no external dependencies. Against the baseline for automated scoring of Korean free-response answers, accuracy improves by +6.16%p (percentage points) on the komoran track, plus a further +1.08%p on the khaiii track. A large-scale evaluation across 28 domains and about 1.075M sentences is in progress.
Mini-dashboard · indicators only
Verified signals. Trends, comparisons, and status only; some absolute figures are withheld from publication.
Accuracy lift: +6.16%p · GT 500 · 5-run identical
Throughput ratio: ~15× · 22 r/s → 330 r/s · concurrency 16
Bit-identical cases: 765 / 765 · 100% pass · multi-set
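The ratio and pass rate on these cards can be re-derived from the figures shown. A minimal sketch in Python; the variable names are illustrative and not part of any NSD tooling:

```python
# Figures copied from the dashboard cards above.
before_rps = 22    # req/s before (multi-container ensemble)
after_rps = 330    # req/s after (unified container)

throughput_ratio = after_rps / before_rps
print(f"throughput ratio: {throughput_ratio:.1f}x")  # prints "throughput ratio: 15.0x"

# Bit-identical cases: passed / total as a percentage.
passed, total = 765, 765
pass_rate = 100 * passed / total
print(f"pass rate: {pass_rate:.0f}%")  # prints "pass rate: 100%"
```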
Throughput · before vs after (req/s)
200 req sweep · concurrency 16 · workers=4 unified container replaces multi-container ensemble
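A sweep of this shape (200 requests, at most 16 in flight) could be driven roughly as follows. This is only a sketch of the measurement pattern, not the harness used for the figures here: `fake_request` and `sweep` are hypothetical names, and the request itself is a stub where a real sweep would call the service.

```python
import asyncio
import time


async def fake_request(i: int) -> int:
    """Stand-in for one HTTP call; a real sweep would call the service here."""
    await asyncio.sleep(0.001)
    return i


async def sweep(total: int = 200, concurrency: int = 16) -> dict:
    """Run `total` requests with at most `concurrency` in flight; report req/s."""
    gate = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def one(i: int) -> int:
        nonlocal in_flight, peak
        async with gate:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_request(i)
            finally:
                in_flight -= 1

    start = time.perf_counter()
    results = await asyncio.gather(*(one(i) for i in range(total)))
    elapsed = time.perf_counter() - start
    return {
        "completed": len(results),
        "peak_in_flight": peak,
        "req_per_s": total / elapsed,
    }


if __name__ == "__main__":
    print(asyncio.run(sweep()))
```

The semaphore caps concurrency on the client side, so the reported req/s reflects the server under a fixed load level and not an unbounded burst.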
Worker scaling · req/s at concurrency 16
Single container scales linearly with worker count · workers=4 is the production default (9.6 GiB memory)
Bit-identical validation
765 / 765 cases
224 + 229 + 312 across 3 sets
HTTP-response-level identity · 5 runs
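One way to check identity at the HTTP-response level is to hash the raw response bytes of each case on every run and require a single distinct digest per case. The sketch below assumes that approach; `is_bit_identical`, `validate`, and `stub_fetch` are hypothetical names, and the stub is deterministic by construction, so the printed count only shows the shape of the check and does not reproduce the real validation.

```python
import hashlib
from typing import Callable, Iterable, Tuple


def is_bit_identical(fetch: Callable[[str, int], bytes], case_id: str, runs: int = 5) -> bool:
    """True if every run returns exactly the same response bytes for this case."""
    digests = {hashlib.sha256(fetch(case_id, run)).hexdigest() for run in range(runs)}
    return len(digests) == 1


def validate(fetch: Callable[[str, int], bytes], case_ids: Iterable[str], runs: int = 5) -> Tuple[int, int]:
    """Return (passed, total) over all cases."""
    ids = list(case_ids)
    passed = sum(is_bit_identical(fetch, c, runs) for c in ids)
    return passed, len(ids)


def stub_fetch(case_id: str, run: int) -> bytes:
    """Hypothetical deterministic stub standing in for the real HTTP call."""
    return f'{{"case": "{case_id}", "score": 1}}'.encode("utf-8")


set_sizes = [224, 229, 312]  # the three validation sets listed above
case_ids = [f"set{s}-case{i}" for s, n in enumerate(set_sizes, start=1) for i in range(n)]

passed, total = validate(stub_fetch, case_ids)
print(f"{passed} / {total}")  # prints "765 / 765" for this deterministic stub
```

Comparing digests of the full response body catches any byte-level drift, including whitespace or key-order changes that a parsed-JSON comparison would hide.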
Components in production
All in-house · zero external analyzer dependencies
Included NSD products
- NSD-Tokenizer-v2.2
- NSD-Korean-Syntax-Analyzer
- NSD-Connector-Analyzer-v2
- NSD-Discourse-Analyzer
- NSD-Unified-Discourse-BSNC
- NSD-Korean-IntDic
Verified internally · 2026-04 · large-scale evaluation across 28 domains / ~1.075M sentences in progress.