
Scripting COAR-DMS in Python

Python is a high-level, general-purpose programming language. It was our clear choice for COAR-DMS because it makes scripting highly efficient and lets us prototype and implement tasks quickly. The productivity that Python brings shortens COAR-DMS implementation time and therefore reduces costs. The source code is easy to read, which shortens the time needed to fix defects and implement new features.

Example - inserting a document into COAR-DMS

#
# COAR-DMS Dominanz s.r.o. www.coardms.com
# python demo program
# Simple example which adds a document to COAR-DMS
# 
import sys
import os,io
import suds
import logging
import unicodedata
import string
from StringIO import StringIO
from suds.client import Client
from warnings import catch_warnings
import datetime
from pycoar import coar
from pycoar import discovery
from pycoar import document
from pycoar import folder
from pycoar import login



coaropt={'host':'192.168.1.1','proto':'http','port':'8888'}
coarinstance = None

def writeDocProc():
    coarinit()
    
    documentName = "Python test document"
    attFileName = '/tmp/attFile.pdf'
    
    # create folder object by COAR-DMS folder path
    f=folder.Folder.getFolderByPath(coarinstance,'/testDocs')
    # create document object
    doc = document.Document(coarinstance)
    # set indexing type: the document is indexed by TAGS, METADATA and FILEDATA (document full text)
    doc.setIndexingType(getIndexingType(coarinstance))
    # set document name
    doc.setParameter('name', documentName)
    desc = 'Document ' + documentName + '  with attached file ' + attFileName + '\n'
    # set document description
    doc.setParameter('description',desc)
    # attaching file 
    doc.attachFile(fileName=attFileName)
    
    #add metadata to document 
    md = dict()    
    md['TYP']='PythonTestDoc'
    md['ID']='8888'
    doc.setMetadata(md)
    
    #set tags 
    t=['Python','program','example','coar-dms']
    doc.setParameter('tags',t)
    
    # save document 
    doc.save()
    # move document to folder 
    doc.moveTo(f)
    # save document
    doc.save()

    coarlogout(coarinstance)
    

def getIndexingType(coarInstance):
    item=coarInstance.createDocumentIndexingTypeEnum()
    it=list()
    it.append(item['TAGS'])
    it.append(item['METADATA'])
    it.append(item['FILEDATA'])
    return it

def coarinit():
    global coarinstance
    try:
        # create object of class Coar
        coarinstance = coar.Coar(**coaropt)
        # create login object inside coar object 
        coarinstance.createLogin()
        # call login
        coarlogin()
    except:
        print sys.exc_info()[1]
        sys.exit(1)

def coarlogin():
    global coarinstance
    try:
        # login to COAR-DMS 
        # 1st parameter: login name
        # 2nd parameter: password
        # 3rd parameter: session inactivity timeout
        coarinstance.getLoginObject().login("coarusr","secret",600)
        return coarinstance
    except:
        print sys.exc_info()[1]
        sys.exit(1)
    
def coarlogout(coarInstance):
    # logout from COAR-DMS
    coarInstance.getLoginObject().logout()

if __name__ == "__main__":
    sys.exit(writeDocProc())

The source code of this example is available for download as insdoc.py.

Example - inserting an e-mail into COAR-DMS

#
# COAR-DMS Dominanz s.r.o. www.coardms.com
# python demo program
# Simple example which reads mail from an IMAP server and writes the e-mails as documents to COAR-DMS
# 
import sys
import os,io
import suds
import logging
import unicodedata
import string
from StringIO import StringIO
from suds.client import Client
from warnings import catch_warnings
import datetime
from pycoar import coar
from pycoar import discovery
from pycoar import document
from pycoar import folder
from pycoar import login
import email
import imaplib
import time
from email.header import decode_header



coaropt={'host':'192.168.1.1','proto':'http','port':'8888'}
coarinstance = None

def startProc():
    
    endloop=False
    coarinit()
    
    while endloop == False:
        msgs = readnextemails()
        if msgs != None:
            writeMailMessagesToCoar(msgs)
        time.sleep(10)


def readnextemails():
    mail = mailLogin()
    msgs = getUnreadMailMessages(mail)
    # release the IMAP connection so a new one is not leaked on every poll
    mailLogout(mail)
    return msgs


def mailLogin():
    try:
        mail = imaplib.IMAP4_SSL('imap.gmail.com')
        mail.login('coar@dominanz.sk', 'secret')
        # return the authenticated IMAP connection
        return mail
    except:
        print sys.exc_info()[1]
        sys.exit(1)

def mailLogout(mail):
    try:
        # close the selected mailbox, then end the IMAP session
        mail.close()
        mail.logout()
    except:
        print sys.exc_info()[1]
        sys.exit(1)




def getUnreadMailMessages(mail):
    # select the inbox and search for messages without the \Seen flag
    mail.select("inbox")
    (retcode, messages) = mail.search(None, '(UNSEEN)')
    if retcode != 'OK':
        return None
    nums = messages[0].split(' ')
    # an empty search result comes back as a single empty string
    if len(nums) == 1 and nums[0] == '':
        return None
    msgs = list()
    for num in nums:
        print 'Read message:', num
        # fetch the complete raw message and parse it
        typ, data = mail.fetch(num, '(RFC822)')
        msg = email.message_from_string(data[0][1])
        # mark the message as read so it is not processed again
        typ, data = mail.store(num, '+FLAGS', '\\Seen')
        msgs.append(msg)
    return msgs

def writeMailMessagesToCoar(msgs):
    
    try:
        coarInstance = coarlogin()
        for msg in msgs:
            writeMessageToCoar(coarInstance, msg)
        coarlogout(coarInstance)
    except:
        print sys.exc_info()[1]
        sys.exit(1)

def writeMessageToCoar(coarInstance,msg):

    #print '@@@@@'
    #print msg
    #print '@@@@@'
    
    m_to =  msg['To']
    m_from =  msg['From']
    m_subject = getSubject(msg)

    f=folder.Folder.getFolderByPath(coarInstance,'/e-mail')
    doc = document.Document(coarInstance)
    doc.setIndexingType(getEmailIndexingType(coarInstance))
    doc.setParameter('name', m_subject)
    desc = 'FROM: ' + m_from + '\n'
    desc +=  'SUBJECT ' + m_subject + '\n'
    doc.setParameter('description',desc)
    rtdata = getMailBody(msg)
    rts = StringIO(rtdata)
    doc.setRichTextData(rtData=rts)
    feml=open('/tmp/att/Mail.eml','wb')
    feml.write(msg.__str__())
    feml.close()
    doc.attachFile(fileName='/tmp/att/Mail.eml')
    
    atts = getAttachments(msg)
    doc.turnVersioningOff()
    doc.save()
    doc.moveTo(f)
    for att in atts:
        doca=document.Document(coarInstance)
        doca.setParameter('name', att['fname'])
        # use a separate name so the folder object f is not shadowed
        fatt=open('/tmp/att/'+att['fname'],'wb')
        fatt.write(att['stream'].read())
        fatt.close()
        doca.attachFile(fileName='/tmp/att/'+att['fname'])
        doca.turnVersioningOff()
        doc.addNestedDocument(document=doca)
        doca.save()
    if atts:
        doc.save()
    
    
def getSubject(msg):
    sl = decode_header(msg['Subject'])
    text=""
    for t in sl:
        if t[1] != None:
            text=text+' '+unicode(t[0],t[1])
        else:
            text=text+' '+unicode(t[0])
    return text
    
def getMailBody(msg):

    text=""
    texthtml=None

    inmalt=0
    if msg.is_multipart():
        for part in msg.walk():
            ct=part.get_content_type()
            if ct == 'multipart/alternative':
                inmalt=1
                continue
            if inmalt == 1:
                if ct == 'text/plain':
                    text = part.get_payload(decode=True)
                if ct == 'text/html':
                    # decode the transfer encoding (base64, quoted-printable) as for text/plain
                    texthtml = part.get_payload(decode=True)
            elif ct == 'text/plain':
                text= text + part.get_payload(decode=True)        

    else:
        #print '@@@@@'
        #print msg.get_payload(decode=True)
        #print '@@@@@'
        text= msg.get_payload(decode=True)        

    if texthtml != None:
        return texthtml
    
    text = createHtml(text)
    
    return text
    
    
def createHtml(text):
    
    html = '<html>\
<head>\
<meta charset="UTF-8">\
</head>\
<body><pre>'

    html = html + text
    html = html + '</pre></body></html>'    
    return html

def getAttachments(msg):
    try:
        atts=list()
        if msg.is_multipart():
            for part in msg.walk():
                cd = part.get('content-disposition')
                if cd != None:
                    _disp = cd.strip().split(";")
                    if _disp[0].lower() == "attachment" or _disp[0].lower() == "inline":
                        att=dict()
                        fn=part.get_filename()
                        fdata = part.get_payload(decode=True)
                        strm = StringIO(fdata)
                        strm.content_type = part.get_content_type()
                        strm.size = len(fdata)
                        strm.seek(0,io.SEEK_SET)
                        att['stream']=strm
                        att['fname']=fn
                        
                        atts.append(att)
                        
        return atts
    except:
        print sys.exc_info()[1]
        sys.exit(1)
                    

def getEmailIndexingType(coarInstance):
    iten=coarInstance.createDocumentIndexingTypeEnum()
    it=list()
    it.append(iten['TAGS'])
    it.append(iten['METADATA'])
    it.append(iten['RICHTEXTDATA'])
    return it

def coarinit():
    global coarinstance
    try:
        coarinstance = coar.Coar(**coaropt)
        coarinstance.createLogin()
    except:
        print sys.exc_info()[1]
        sys.exit(1)

def coarlogin():
    global coarinstance
    try:
        coarinstance.getLoginObject().login("coarusr","secret",600)
        return coarinstance
    except:
        print sys.exc_info()[1]
        sys.exit(1)
    
def coarlogout(coarInstance):
    coarInstance.getLoginObject().logout()

if __name__ == "__main__":
    sys.exit(startProc())

The source code of this example is available for download as mailproc.py.
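
The getSubject and getMailBody functions above rely on Python 2 string handling (unicode(), byte strings returned by get_payload). As a rough sketch of how the same two steps might look with the Python 3 standard library, the block below decodes an encoded Subject header and picks the HTML part over the plain-text part. The function names decode_subject and pick_body and the demo message are illustrative assumptions and are not part of mailproc.py.

```python
import email
from email.header import Header, decode_header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded header value into one text string."""
    parts = []
    for value, charset in decode_header(raw_subject):
        if isinstance(value, bytes):
            value = value.decode(charset or 'ascii', errors='replace')
        parts.append(value)
    return ''.join(parts)


def pick_body(msg):
    """Return (content_type, text), preferring text/html over text/plain."""
    plain = html = None
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        # skip parts that are explicitly marked as attachments
        if part.get('Content-Disposition', '').lower().startswith('attachment'):
            continue
        payload = part.get_payload(decode=True)  # undo base64 / quoted-printable
        if payload is None:
            continue
        text = payload.decode(part.get_content_charset() or 'utf-8',
                              errors='replace')
        ctype = part.get_content_type()
        if ctype == 'text/html' and html is None:
            html = text
        elif ctype == 'text/plain' and plain is None:
            plain = text
    if html is not None:
        return 'text/html', html
    return 'text/plain', plain or ''


# Build a small demo message instead of reading one from an IMAP server.
demo = MIMEMultipart('alternative')
demo['Subject'] = Header('Pr\u00edklad', 'utf-8').encode()
demo.attach(MIMEText('plain body', 'plain', 'utf-8'))
demo.attach(MIMEText('<b>html body</b>', 'html', 'utf-8'))
parsed = email.message_from_string(demo.as_string())

print(decode_subject(parsed['Subject']))
print(pick_body(parsed))
```

Decoding every text part with decode=True and its declared charset keeps the HTML and plain-text branches consistent, which matters when the body is sent base64 or quoted-printable encoded.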



Programming COAR-DMS in C++

The COAR-DMS server provides a local API in C++. A program that uses the COAR-DMS API links against a dynamic library containing the objects and functions that provide safe access to the COAR-DMS storage. Access is safe in multithreaded and multiprocess environments. Programs built on the COAR-DMS C++ API offer higher performance than those using the WebServices API or the Python API.

// Code fragments in C++ language for COAR-DMS 

// create base object of COAR-DMS API 
coar* coarinstance = new coar();

// authenticate the user against the authentication backend
coarinstance->getLoginInfo()->loginUser("userName","password");

// create document object
document* doc = new document (coarinstance);

// set document name
doc->setName ("name_of_the_document");
// set document description as wstring 
doc->setDescription (L"описание документа - Document description");


// setting metadata
mdmap_t metadata;
metadata.insert(pair<string,string>("meta","data"));
doc->setMetaData(metadata);

// open the file to attach
fstream file;
file.open ("./path/to/the/file.txt");
istream& docdata = file;
// attach file and set MIME type
doc->setDocData(docdata, "text/plain");


// save document 
doc->save();



// Search documents
// get searchEngine object
searchEngine* se = coarinstance->getSearchEngine();

// build query
string query = "select name from document where name LIKE 'new doc%' order by id;";
// set conditions
int conditions = SEARCH_CASESENSITIVE | SEARCH_ACCENTSENSITIVE;
// run search
se->search(query, true, conditions);

// read cursor 
row r;
se->setCurrentIndex(INDEX_UNDEF);
string result = "";
while(se->getNextResultRow(r) != -1)
{
  for(row::iterator it = r.begin(); it != r.end();it++)
  {
    result += it->second.c_str() + (string)"\t\t";
  }
  result += "\n";
}



COAR-DMS WebServices SOAP API

The COAR-DMS server provides a universal network interface based on the WSDL and SOAP standards. The server includes a built-in HTTP server that exposes the services defined in coar.wsdl as well as WebDAV protocol services. From the coar.wsdl definition, clients can be generated for various languages, for example Java and C#.

The following command generates a WS client for Java: wsimport -target 2.2 -d outdir -p com.coardms.coar.ws -s outdirsrc coar.wsdl

For C#, support is built directly into Visual Studio: adding the URL where coar.wsdl is served as a Service Reference generates a client for the COAR-DMS WS API.
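
The Python examples on this page import suds, a SOAP client library that builds a client from a WSDL document at run time instead of generating source files. The sketch below shows how such a client might be pointed at the built-in HTTP server. The '/coar.wsdl' path is an assumption (this page does not state where the server publishes the file), and wsdl_url and make_client are hypothetical helper names.

```python
def wsdl_url(opts, wsdl_path='/coar.wsdl'):
    """Build the WSDL URL from connection options such as the coaropt dict.

    The default wsdl_path is an assumption; adjust it to the actual
    location of coar.wsdl on the server.
    """
    return '{proto}://{host}:{port}{path}'.format(path=wsdl_path, **opts)


def make_client(opts):
    """Create a suds client from the server's WSDL (requires suds)."""
    # imported here so the URL helper can be used without suds installed
    from suds.client import Client
    return Client(wsdl_url(opts))


coaropt = {'host': '192.168.1.1', 'proto': 'http', 'port': '8888'}
print(wsdl_url(coaropt))  # http://192.168.1.1:8888/coar.wsdl
```

Printing a suds client object lists the services, methods and types it found in the WSDL, which is a quick way to inspect what coar.wsdl defines before writing any calls.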
