Install docx2txt on Ubuntu 20.04

docx2txt is a command-line tool that converts MS Word (DOCX) files to plain text while preserving some formatting. It requires the Perl interpreter.

This tutorial shows how to install docx2txt on Ubuntu 20.04.

Install docx2txt

Execute the following command to update the package lists:

sudo apt update

Install docx2txt:

sudo apt install -y docx2txt
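
To confirm that the installation worked, one option is to check that the command can be found on the PATH. The snippet below is only a sketch and not part of the package: check_tool is a hypothetical helper name, and the Perl check assumes the interpreter is available under the name perl.

```shell
#!/bin/sh
# check_tool is a hypothetical helper (not part of docx2txt):
# it reports whether a command can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found" >&2
        return 1
    fi
}

# docx2txt needs Perl, so check both commands.
check_tool perl || echo "perl is missing"
check_tool docx2txt || echo "docx2txt is missing; try: sudo apt install -y docx2txt"
```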

Testing docx2txt

Download a DOCX file for testing:

wget -O test.docx https://raw.githubusercontent.com/dbashford/textract/master/test/files/docx.docx

Run the docx2txt command to convert the DOCX file to a plain text file:

docx2txt test.docx test.txt

Check the content of the plain text file:

cat test.txt
This is a test
Just so you know:
...........
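
The same command can be applied to several files in a loop. The following is a minimal sketch, assuming all DOCX files sit in one directory and each text file should be written next to its source; convert_all is a hypothetical function name, not something the package provides.

```shell
#!/bin/sh
# convert_all is a hypothetical helper: it converts every .docx file
# in the given directory (default: current directory) to a .txt file
# with the same base name.
convert_all() {
    dir="${1:-.}"
    for f in "$dir"/*.docx; do
        # Skip the unexpanded pattern when no .docx files exist.
        [ -e "$f" ] || continue
        docx2txt "$f" "${f%.docx}.txt"
    done
}
```

After defining the function, calling convert_all with a directory path converts the files in that directory, so test.docx would become test.txt.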

Results can be written to standard output by providing a dash (-) as the output file name:

docx2txt test.docx -
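
Writing to standard output makes it possible to pipe the text into other tools. As a small sketch, the hypothetical helper docx_word_count below (the name is made up for this example) counts the words in the converted text with wc:

```shell
#!/bin/sh
# docx_word_count is a hypothetical helper: it converts a DOCX file to
# text on standard output and counts the words with wc.
docx_word_count() {
    docx2txt "$1" - | wc -w
}
```

For example, docx_word_count test.docx prints the number of words found in the test file.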

Uninstall docx2txt

If docx2txt is no longer needed, you can remove it with the following command:

sudo apt purge --autoremove -y docx2txt
