Analysis Tool for huge logfiles
02.11.16 19:35

Dear XDK users,

I have a few use cases where I am logging data with the demoDataLogger. When logging for 15 to 20 hours, I get log files with up to 8,000,000 lines, but Excel can only handle up to 1,048,576 rows. For now I am using a trial version of "NI DIAdem" for the analysis, but this can't be the best practice.

Can you give me some hints? Which software are you using?

 

RE: Analysis Tool for huge logfiles
03.11.16 12:10 in reply to Stephan Niederwald.

Hello Stephan,

The best tool to analyse measurement data is undoubtedly MATLAB, or an open-source alternative like Octave.
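For files this large, a minimal sketch of chunked reading with textscan, so the whole log never has to fit into memory at once (the file name, the three-column %f format, and the chunk size are only placeholders to adapt to your own log):

fid = fopen('xdk_log.csv');     % hypothetical file name
fgetl(fid);                     % read and discard the header line
chunk = textscan(fid, '%f%f%f', 100000, 'Delimiter', ';');
while ~isempty(chunk{1})
    % ... process this chunk of up to 100000 rows here ...
    chunk = textscan(fid, '%f%f%f', 100000, 'Delimiter', ';');
end
fclose(fid);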

I hope this helps you.

Kind regards,
Manuel

RE: Analysis Tool for huge logfiles
24.11.16 09:53 in reply to Manuel Cerny.

Hi, I had the same problem and wrote a MATLAB function to load the data in the fastest possible manner.

You just have to pass the file name (with path) as a cell array called "Str", together with the size of the buffer. You have to test which buffer setting suits you; for me it is 5e4. See the example call below the output list.

The code reads in the header and then loads the data in batches, parsing them into the necessary format. It saves the loaded data in a file called scannedData.mat and also returns it as an array of the same name.

Other outputs are:
FL: the number of data rows (file length)
datacol: the number of columns
Rdt_hdr: cell array with the axis names from the file header
dlmtr: the delimiters found in the header line, which can be used to check whether there is an empty column in the file
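
A call could then look like this (the file path below is only a placeholder for your own log file):

Str = {'C:\logs\xdk_log.csv'};   % hypothetical path to the log file
bufferSize = 5e4;                % batch size in characters; tune this for your machine
[Rdt_hdr, dlmtr, scannedData, FL, datacol] = fload_data_dtllgr(Str, bufferSize);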

Sorry for the bad documentation.

Greetings, Marius
 

Here is the code:

function [Rdt_hdr, dlmtr, scannedData, FL, datacol] = fload_data_dtllgr(Str, bufferSize)
%FLOAD_DATA_DTLLGR Batch-load a large demoDataLogger file.
%   Str        - cell array; Str{1} is the file name (with path)
%   bufferSize - number of characters to read per batch, e.g. 5e4
%
%   Returns the header tokens (Rdt_hdr), the delimiters of the header
%   line (dlmtr), the parsed data (scannedData, which is also written
%   to scannedData.mat), the number of data rows (FL) and the number
%   of columns (datacol).

fid = fopen(Str{1});
if fid == -1
    error('Could not open file: %s', Str{1});
end
eol = sprintf('\n');

%% Scan the header line
hdr_line = read_to_eol(fid, eol);
[Rdt_hdr, dlmtr] = strsplit(strtrim(hdr_line), ';');

% Build the sscanf format once: one %d per column. Because every data
% line ends with a trailing ';' before the newline, sscanf stops after
% exactly one line once the format has been used up.
data_format = strcat('%d', repmat(';%d', 1, length(dlmtr) - 1));

%% Read the first batch and extend it to the next end of line, so that
% no line is cut in half between two batches
dataBatch = fread(fid, bufferSize, 'uint8=>char')';
data = [dataBatch read_to_eol(fid, eol)];

% Create the output file with an empty array; the dumps below replace it
scannedData = [];
filename = 'scannedData.mat';
save(filename, 'scannedData', '-v7.3');

while ~isempty(data)

    % Parse the batch line by line
    while ~isempty(data)
        scannedData = [scannedData; reshape(sscanf(data, data_format), 1, [])]; %#ok<AGROW>
        % Cut the line that was just parsed out of the buffer
        nl = find(data == eol, 1);
        if isempty(nl) || all(isstrprop(data(nl+1:end), 'wspace'))
            data = [];
        else
            data = data(nl+1:end);
        end
    end

    % Here the data is dumped into the file, so memory use stays bounded
    if size(scannedData, 1) > 10000
        a = load('scannedData.mat');
        scannedData = [a.scannedData; scannedData];
        save(filename, 'scannedData', '-append');
        scannedData = [];
    end

    % The next batch is read in
    dataBatch = fread(fid, bufferSize, 'uint8=>char')';
    data = [dataBatch read_to_eol(fid, eol)];

end
fclose(fid);

% Merge the rows still in memory with the rows already on disk and
% store the complete result
a = load('scannedData.mat');
scannedData = [a.scannedData; scannedData];
save(filename, 'scannedData', '-append');

FL      = size(scannedData, 1);   % number of data rows
datacol = size(scannedData, 2);   % number of columns

end

function txt = read_to_eol(fid, eol)
%READ_TO_EOL Read single characters up to and including the next newline.
txt = '';
c = fread(fid, 1, 'uint8=>char');
while ~isempty(c)
    txt(end+1) = c; %#ok<AGROW>
    if c == eol
        break
    end
    c = fread(fid, 1, 'uint8=>char');
end
end
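
To work with the dumped file later without rerunning the scan, it can simply be reloaded (the array comes back under its saved name):

s = load('scannedData.mat');
scannedData = s.scannedData;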
RE: Analysis Tool for huge logfiles
24.11.16 10:09 in reply to Marius Wolf.

Hello Marius,

thank you very much for sharing your code with the community. There have been many requests for exactly this kind of analysis code.

I moved this thread to the project exchange so it can be found easily.

Kind regards,
Manuel

RE: Analysis Tool for huge logfiles
25.11.16 13:24 in reply to Manuel Cerny.
You might also look at Splunk or KNIME (commercial), or use some form of R to process even larger files.