docx2txt is a command-line tool for converting MS Word (DOCX) files to plain text files while preserving some formatting. The tool requires the Perl interpreter.
This tutorial shows how to install docx2txt on Ubuntu 24.04.
Install docx2txt
Execute the following command to update the package lists:
sudo apt update
Install docx2txt:
sudo apt install -y docx2txt
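To confirm that the installation succeeded, you can check whether the command is on the PATH. The following is a minimal sketch; check_docx2txt is a hypothetical helper name used only for illustration, not something the package provides:

```shell
# check_docx2txt is a hypothetical helper, not part of the docx2txt package.
# It reports whether the docx2txt command can be found on the PATH.
check_docx2txt() {
    if command -v docx2txt >/dev/null 2>&1; then
        echo "docx2txt found at: $(command -v docx2txt)"
    else
        echo "docx2txt not found"
        return 1
    fi
}

# Run the check; "|| true" keeps a script going even if the tool is missing.
check_docx2txt || true
```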
Testing docx2txt
Download a DOCX file for testing:
wget -qO test.docx https://raw.githubusercontent.com/dbashford/textract/master/test/files/docx.docx
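Because wget runs quietly here, a failed download can go unnoticed. A DOCX file is a ZIP archive, so its first two bytes should be "PK". The sketch below relies on that fact; looks_like_docx is a hypothetical helper name, and it only checks the signature, not the document itself:

```shell
# looks_like_docx is a hypothetical helper. It only checks the ZIP signature
# ("PK") at the start of the file; it does not validate the Word content.
looks_like_docx() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

if looks_like_docx test.docx; then
    echo "test.docx looks like a ZIP-based DOCX file"
else
    echo "test.docx is missing or not a DOCX file"
fi
```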
Run the docx2txt command to convert the DOCX file to a plain text file:
docx2txt test.docx test.txt
Check the content of the plain text file:
cat test.txt
This is a test
Just so you know:
...........
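The same two-argument form shown above can be applied to several files in a loop. The following is a rough sketch under that assumption; convert_dir is a hypothetical helper name, and each output file is written next to its input with a .txt extension:

```shell
# convert_dir is a hypothetical helper, not part of docx2txt.
# It converts every .docx file in a directory (default: current directory)
# to a .txt file with the same base name.
convert_dir() {
    dir="${1:-.}"
    count=0
    for f in "$dir"/*.docx; do
        # Skip the unexpanded pattern when the directory has no .docx files.
        [ -e "$f" ] || continue
        docx2txt "$f" "${f%.docx}.txt" && count=$((count + 1))
    done
    echo "converted $count file(s)"
}

# Example (not run here): convert_dir ~/Documents
```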
Results can be written to standard output by providing a dash (-) as the output file name:
docx2txt test.docx -
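Writing to standard output makes it possible to pipe the text into other tools without creating an intermediate file. As a small illustration, the sketch below counts the words in a document with wc; docx_word_count is a hypothetical helper name:

```shell
# docx_word_count is a hypothetical helper, not part of docx2txt.
# It sends the converted text to standard output (the "-" argument)
# and pipes it to wc to count the words.
docx_word_count() {
    docx2txt "$1" - | wc -w
}

# Example (not run here): docx_word_count test.docx
```

The same pattern works with other filters, for example grep to search the converted text.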
Uninstall docx2txt
If docx2txt is no longer needed, you can remove it with the following command:
sudo apt purge --autoremove -y docx2txt