Author | Topic: extracting text and images | |
---|---|---|
Zdenko Bielik | extracting text and images on Tue, 05 Oct 2010 12:51:46 +0200<br><br>Hi gurus,<br><br>I need to save the content of a Word document as separate text files and images. For example, from the attached file I need to save:<br>- the first section of "a" characters as file "section-1.txt",<br>- the second section of "b" characters as file "section-2.txt",<br>- the red image as file "image-1.jpg",<br>- the third section of "c" characters as file "section-3.txt",<br>- the fourth section of "d" characters as file "section-4.txt",<br>- the blue image as file "image-2.jpg",<br>- the yellow image as file "image-3.jpg",<br>- the fifth section of "e" characters as file "section-5.txt".<br><br>I have no experience with ActiveX. Please, can someone help me with this?<br><br>TIA & Regards<br>Zdeno<br><br>Attachment: ActiveX test.doc | |
AUGE_OHR | Re: extracting text and images on Tue, 05 Oct 2010 18:08:18 +0200<br><br>hi,<br><br>> I need to save the content of a Word document as separate text files and images.<br><br>Start the Word macro recorder, do your "action" manually the way you want it, then stop the macro.<br><br>> I have no experience with ActiveX.<br><br>Now open the macro in the VBA editor (ALT-F11) and look "inside". If you have any "code", we can help you to "translate" it.<br><br>> Please, can someone help me with this?<br><br>I'm not sure if Word can do this. How will Word "recognize" what you want? I think you have to use OCR for it.<br><br>greetings by OHR<br>Jimmy | |
Zdenko Bielik | Re: extracting text and images on Tue, 05 Oct 2010 19:03:56 +0200<br><br>Hi Jimmy,<br><br>> I'm not sure if Word can do this. How will Word "recognize" what you want?<br><br>Hmmm, I understand... So, another question: is it possible to save the whole text of a doc file in one txt file and the included images in separate files?<br><br>Regards<br>Zdeno | |
Thomas Braun | Re: extracting text and images on Wed, 06 Oct 2010 15:42:56 +0200<br><br>Zdenko Bielik wrote:<br>> hmmm, I understand...<br>> so, other question: is it possible save whole text from doc file in one txt<br><br>Yes, you can do a "save as..." operation and specify text as the target format. When doing this with the macro recorder, you will get VB code similar to this:<br><br>`Sub Makro1()`<br>`'`<br>`' Makro1 macro`<br>`' Macro recorded on 06.10.2010 by Thomas Braun`<br>`'`<br>`ChangeFileOpenDirectory _`<br>`"C:\Dokumente und Einstellungen\thomas.braun\Desktop\"`<br>`ActiveDocument.SaveAs FileName:="dsfsa.txt", FileFormat:=wdFormatText, _`<br>`LockComments:=False, Password:="", AddToRecentFiles:=True, WritePassword _`<br>`:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _`<br>`SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _`<br>`False, Encoding:=1252, InsertLineBreaks:=False, AllowSubstitutions:=False _`<br>`, LineEnding:=wdCRLF`<br>`End Sub`<br><br>> file and included images in separate files?<br><br>You could save the file as an HTML document and take the pictures from the subfolder generated by Word:<br><br>http://vbadud.blogspot.com/2010/05/how-to-retrieve-images-of-word-document.html<br><br>Thomas | |
Zdenko Bielik | Re: extracting text and images on Thu, 07 Oct 2010 12:46:02 +0200<br><br>Hi Thomas,<br><br>thanks for the info and the link. Unfortunately, I don't know how to translate VB code to Xbase++... I will try asking Jimmy.<br><br>Thanks<br>Zdeno | |
Zdenko Bielik | Re: extracting text and images on Thu, 07 Oct 2010 12:49:55 +0200<br><br>Hi Jimmy (or someone else),<br><br>> Yes - you can do a "save as..." operation and specify text as the target<br>> format. When doing this with the macro recorder, you will get VB code<br>> similar to this:<br><br>> You could save the file as a html document and take the pictures from the<br>> subfolder generated by word:<br>><br>> http://vbadud.blogspot.com/2010/05/how-to-retrieve-images-of-word-document.html<br><br>Can you help me now with my problem? Thomas posted a workaround and a link with a VB solution, but I don't know how this can be translated into Xbase++...<br><br>TIA<br>Zdeno | |
AUGE_OHR | Re: extracting text and images on Thu, 07 Oct 2010 17:48:05 +0200<br><br>hi,<br><br>>> Yes - you can do a "save as..." operation and specify text as the target<br>>> format. When doing this with the macro recorder, you will get VB code<br>>> similar to this:<br>> ...<br><br>There is not much to "translate" (see the code further down).<br><br>>> You could save the file as a html document and take the pictures from the<br>>> subfolder generated by word:<br>>><br>>> http://vbadud.blogspot.com/2010/05/how-to-retrieve-images-of-word-document.html<br>><br>> can you help me now with my problem?<br>> Thomas posted any workaround and link with VB solution,<br>> but I don't know, how can be this translated into Xbase++...<br><br>When you "SaveAs" HTML, Word will create a subfolder, so you just have to search it with Directory() for your pictures. The name of the subfolder, in my German version, is FileName+"-Dateien", so you have to look at what name your version uses.<br><br>greetings by OHR<br>Jimmy<br><br>`*** Code ***`<br>`* always use "full" path`<br>`#include "activex.ch"`<br>`#include "common.ch"`<br>`#define wdFormatHTML 8`<br>`PROCEDURE MAIN(cFile,cSaveAs)`<br>`LOCAL oWord,oDoc`<br>`LOCAL cPath := CURDRIVE()+":\"+CurDir()+"\"`<br>`DEFAULT cFile TO "ActiveX test.doc"`<br>`DEFAULT cSaveAs TO "ActiveX test.htm"`<br>`IF FILE(cPath+cFile)`<br>`oWord := CreateObject("Word.Application")`<br>`IF Empty( oWord )`<br>`MsgBox( "Microsoft Word not installed" )`<br>`* without Word there is nothing more to do`<br>`RETURN`<br>`ENDIF`<br>`oWord:visible := .T.`<br>`* open DOC`<br>`oWord:documents:open( cPath+cFile )`<br>`oDoc := oWord:ActiveDocument`<br>`* saveAs HTML`<br>`oDoc:saveas(cPath+cSaveAs,wdFormatHTML)`<br>`* close DOC`<br>`oDoc:close()`<br>`oWord:Quit()`<br>`oWord:destroy()`<br>`* now search for Subfolder with Directory()`<br>`ELSE`<br>`MsgBox( "File "+cPath+cFile+" not found" )`<br>`ENDIF`<br>`RETURN`<br>`*** EOF ***` | |
Zdenko Bielik | Re: extracting text and images on Thu, 07 Oct 2010 19:26:52 +0200<br><br>Hi Jimmy,<br><br>thank you!!! Works great!<br><br>> #define wdFormatHTML 8<br><br>Please, can you post here all the other possible "define constants" for file types?<br><br>TIA<br>Zdeno | |
AUGE_OHR | Re: extracting text and images on Thu, 07 Oct 2010 20:59:25 +0200<br><br>hi,<br><br>>> #define wdFormatHTML 8<br>> Please, can you post here all other possible "define constants" for file<br>> types?<br><br>Since it is different for each Office version, you have to "generate" it yourself:<br><br>`TLB2CH.EXE "WORD.APPLICATION" /o:MyWord.CH`<br><br>greetings by OHR<br>Jimmy |
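
Jimmy's program stops at the comment "now search for Subfolder with Directory()". The step could be sketched roughly as follows. This is only an untested sketch: `ListPictures()` is a hypothetical helper name that does not appear in the thread, and the folder suffix is an assumption that has to be checked against the installed Word version ("-Dateien" in Jimmy's German Word; an English Word typically uses "_files").

```xbase
#include "directry.ch"

* Hypothetical helper (not from the thread): returns the full names of
* all files in the subfolder Word creates next to the saved HTML file.
* cSuffix is an assumption, e.g. "-Dateien" (German) or "_files" (English).
FUNCTION ListPictures( cPath, cSaveAs, cSuffix )
   LOCAL cBase  := cSaveAs
   LOCAL nDot   := RAt( ".", cSaveAs )
   LOCAL aFiles := {}
   LOCAL cFolder, aDir, i

   * strip the extension: "ActiveX test.htm" -> "ActiveX test"
   IF nDot > 0
      cBase := Left( cSaveAs, nDot-1 )
   ENDIF
   cFolder := cPath + cBase + cSuffix + "\"

   * Directory() returns one sub-array per file; F_NAME is the file name
   aDir := Directory( cFolder + "*.*" )
   FOR i := 1 TO Len( aDir )
      AAdd( aFiles, cFolder + aDir[i,F_NAME] )
   NEXT
RETURN aFiles
```

In Jimmy's MAIN procedure it could be called after `oWord:destroy()`, for example as `aPics := ListPictures( cPath, cSaveAs, "-Dateien" )`. The folder may also contain non-image files that Word writes alongside the pictures, so filtering the result by extension is probably a good idea.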