word2vec notes (converting a word2vec bin file to txt)

瘋誑徳嬡仩 2022-11-01 21:31:09

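from gensim.models import word2vec

# Load the binary vectors, then write them back out as plain text.
# Note: in newer gensim releases this loader lives on gensim.models.KeyedVectors.
model = word2vec.Word2Vec.load_word2vec_format('/home/ubuntu/word2vec/PubMed-w2v.bin', binary=True)
model.save_word2vec_format('/home/ubuntu/word2vec/PubMed-w2v.txt', binary=False)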

Running the gensim code printed a warning:

[Screenshot of the warning output]

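As someone with mild OCD, I found this UserWarning very annoying, so I tried to install Pattern, and the installation failed in all sorts of ways. Searching online for the cause, I found that Pattern only supports Python 2.x, while I was using Python 3.5.2. I was not going to fall back to Python 2.x over a warning, so I put up with it. Only mild OCD after all.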
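For reference, the file written with binary=False is plain text: the first line holds the vocabulary size and the vector dimensionality, and each following line holds a word followed by its values separated by spaces. A minimal sketch of reading such a file back without gensim might look like the following. The function name `read_word2vec_text` and the tiny sample data are made up for illustration, and the sketch assumes words contain no whitespace.

```python
import io


def read_word2vec_text(stream):
    """Parse word2vec text format from a text stream into {word: [floats]}.

    Hypothetical helper for illustration; assumes words contain no whitespace.
    """
    header = stream.readline().split()
    vocab_size, vector_size = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != vector_size + 1:
            raise ValueError("unexpected number of fields: %r" % line)
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError("header announced %d words, found %d"
                         % (vocab_size, len(vectors)))
    return vectors


# Made-up sample in the same layout: 2 words, 3 dimensions each.
sample = "2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n"
vectors = read_word2vec_text(io.StringIO(sample))
print(vectors["world"])
```

For a real file, the same function could be given a handle opened with `open(path, encoding="utf-8")` instead of the in-memory sample.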

Second method (much the same as the first, but recorded here anyway)

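from gensim.models import word2vec

model = word2vec.Word2Vec.load_word2vec_format('Path/to/GoogleNews-vectors-negative300.bin', binary=True)
# Note: save() writes gensim's own serialized format rather than word2vec plain
# text, whatever the file extension; use save_word2vec_format(..., binary=False) for text.
model.save("file.txt")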

Third method (they are all really the same)

import codecs

from gensim.models import Word2Vec


def main():
    path_to_model = 'GoogleNews-vectors-negative300.bin'
    output_file = 'GoogleNews-vectors-negative300_test.txt'
    export_to_file(path_to_model, output_file)


def export_to_file(path_to_model, output_file):
    output = codecs.open(output_file, 'w', 'utf-8')
    model = Word2Vec.load_word2vec_format(path_to_model, binary=True)
    print('done loading Word2Vec')
    vocab = model.vocab
    for mid in vocab:
        # print(model[mid])
        # print(mid)
        # Collect the vector components as strings.
        vector = list()
        for dimension in model[mid]:
            vector.append(str(dimension))
        # line = { "mid": mid, "vector": vector }
        # One line per word: the word, a tab, then comma-separated values.
        vector_str = ",".join(vector)
        line = mid + "\t" + vector_str
        # line = json.dumps(line)
        output.write(line + "\n")
    output.close()


if __name__ == "__main__":
    main()
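The export above still goes through gensim to read the binary file. As a rough sketch of what that reading involves, the converter below uses only the standard library. It assumes the usual word2vec binary layout: a text header line with the vocabulary size and vector size, then for each word its bytes terminated by a space, followed by that many 32-bit floats. It also assumes little-endian floats. The function name `convert_bin_to_txt` and the tiny test file are made up for illustration, and reading one byte at a time is slow on large models.

```python
import os
import struct
import tempfile


def convert_bin_to_txt(bin_path, txt_path):
    """Convert a word2vec binary file to text (hypothetical helper, a sketch)."""
    with open(bin_path, "rb") as src, open(txt_path, "w", encoding="utf-8") as dst:
        header = src.readline().decode("utf-8")
        vocab_size, vector_size = (int(x) for x in header.split())
        dst.write("%d %d\n" % (vocab_size, vector_size))
        fmt = "<%df" % vector_size   # assumption: little-endian float32
        nbytes = 4 * vector_size
        for _ in range(vocab_size):
            # Read the word byte by byte up to the separating space.
            word_bytes = bytearray()
            while True:
                ch = src.read(1)
                if ch == b"":
                    raise EOFError("file ended while reading a word")
                if ch == b" ":
                    break
                if ch != b"\n":  # entries may be separated by a newline
                    word_bytes.extend(ch)
            word = word_bytes.decode("utf-8", errors="replace")
            data = src.read(nbytes)
            if len(data) != nbytes:
                raise EOFError("file ended while reading a vector")
            values = struct.unpack(fmt, data)
            dst.write(word + " " + " ".join(repr(v) for v in values) + "\n")


# Build a tiny made-up binary file (2 words, 3 dimensions) and convert it.
tmp_dir = tempfile.mkdtemp()
bin_path = os.path.join(tmp_dir, "tiny.bin")
txt_path = os.path.join(tmp_dir, "tiny.txt")
with open(bin_path, "wb") as f:
    f.write(b"2 3\n")
    f.write(b"cat " + struct.pack("<3f", 0.5, -1.25, 2.0) + b"\n")
    f.write(b"dog " + struct.pack("<3f", 1.0, 0.0, -0.5) + b"\n")

convert_bin_to_txt(bin_path, txt_path)
with open(txt_path, encoding="utf-8") as f:
    converted = f.read()
print(converted)
```

Unlike the tab and comma layout in the third method, this sketch writes the space-separated layout that the first method produces, so the output can be loaded again as a word2vec text file.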

