Corpus scale
Vocabulary: 25970
Docs: 1000
Tokens: 106776
Topics: 1000
Training batch size: 100
The cluster has 20 servers; each server has an 8-core CPU and 48 GB of memory.
I measured the performance of running LDA:
prefetch time: 700 ms
build table time: 200 ms
sample time: 12 ms
push time: 200 ms
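To put the four timings above in proportion, here is a minimal Python sketch that computes each phase's share of the total. It assumes the phases run one after another with no overlap, which is my assumption and not something measured; the names `timings_ms` and `phase_shares` are made up for illustration.

```python
# Phase timings in milliseconds, copied from the measurements listed above.
timings_ms = {
    "prefetch": 700,
    "build table": 200,
    "sample": 12,
    "push": 200,
}


def phase_shares(timings):
    """Return each phase's fraction of the summed time.

    Assumption: the phases are sequential and do not overlap.
    """
    total = sum(timings.values())
    return {name: ms / total for name, ms in timings.items()}


total_ms = sum(timings_ms.values())
shares = phase_shares(timings_ms)

for name, ms in timings_ms.items():
    print(f"{name:12s} {ms:5d} ms  {shares[name]:6.1%}")
print(f"{'total':12s} {total_ms:5d} ms")

# Ratio of table-building time to sampling time.
ratio = timings_ms["build table"] / timings_ms["sample"]
print(f"build table / sample = {ratio:.1f}x")
```

If the phases actually overlap (for example, if prefetch runs in the background), the shares would need to be computed from wall-clock time instead of the sum.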
So the build table time is too long compared to the prefetch time and pull time. Is there any method to improve the build table time?