alWAYS Beginner: Tnt tagging through python

Thursday, 17 September 2015

TnT tagging through Python

From my previous post, I hope all of you know how to tag text using the TnT tagger. To tag a file called test_file with the model tagged_file, run the following command in a terminal:

$ ./tnt tagged_file test_file
And if we want to save the tagged output, use the following command instead of the one above:
$ ./tnt tagged_file test_file > output.txt

Here is another way to tag test_file, this time from Python code. As above, the tagged output can be stored in another file for future reference. Add the following lines to a Python script:

import os

tnt_executable = '/home/ajuna/components/tnt/tnt'  # path to the tnt executable
tnt_model = '/home/ajuna/components/tnt/tagged_file'  # path to the model named tagged_file

def run_tnt(input_file='tokenized_file', output_file='taggedoutput.txt'):
    "call tnt from Linux"
    # Build the same command line as in the terminal, including the '>' redirect
    tnt_command = '%s %s %s > %s' % (tnt_executable, tnt_model, input_file, output_file)
    os.system(tnt_command)

run_tnt('test_file', 'output.txt')

# Here test_file is the input file to be tagged (it contains the words), and output.txt is the file where the output is saved (the words with their tags after running the TnT tagger). Both names are relative to the directory the script is run from.
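The same call can also be made with the subprocess module, which avoids building a shell command string. The following is only a rough sketch, not part of the original script: the function read_tagged is a hypothetical helper, and it assumes that the tagger output has one word per line followed by whitespace and its tag, and that lines starting with %% are comments.

```python
import subprocess
from pathlib import Path


def run_tnt(executable, model, input_file, output_file):
    """Run TnT on input_file and save its standard output to output_file."""
    # Fail early with a clear message if the file to be tagged is missing.
    if not Path(input_file).is_file():
        raise FileNotFoundError(f"input file not found: {input_file}")
    with open(output_file, "w") as out:
        # A list of arguments needs no shell quoting; stdout=out replaces '>'.
        result = subprocess.run([executable, model, input_file], stdout=out)
    return result.returncode


def read_tagged(output_file):
    """Return (word, tag) pairs from a tagged file (hypothetical helper).

    Assumes one word per line followed by whitespace and its tag,
    and that lines starting with '%%' are comments.
    """
    pairs = []
    with open(output_file) as f:
        for line in f:
            fields = line.split()
            # Skip comment lines and lines without both a word and a tag.
            if line.startswith("%%") or len(fields) < 2:
                continue
            pairs.append((fields[0], fields[1]))
    return pairs
```

With the paths from the script above, run_tnt(tnt_executable, tnt_model, 'test_file', 'output.txt') would write the tagged words to output.txt, and read_tagged('output.txt') would load them back as a list of pairs. A non-zero return value from run_tnt suggests that the tagger itself reported a problem.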

4 comments:

THULU, 28 May 2016 at 22:48
This comment has been removed by the author.
Aash, 12 March 2017 at 09:18
Error:

Building suffix trie (113 lowercase, 0 uppercase)
Estimating lambdas (done)
lambda1 = 2.857143e-01 lambda2 = 2.857143e-01 lambda3 = 4.285714e-01
lam_bi1 = 3.333333e-01 lam_bi2 = 6.666667e-01
suffix theta = 1.947190e-01
Error: cannot find corpus 'tagged_file'
archaunni, 24 April 2017 at 08:17
You first need to create a tagged_file for training, which contains a list of words with their corresponding tags.
Unknown, 6 July 2018 at 22:45
this gives wrong output