files_merger

Gem to merge .csv and .txt files! You can sort by column, remove duplicates chosing the priority by order, change delimiter and read parameters from external file if you want.


License
MIT
Install
gem install files_merger -v 1.0.5

Documentation

FilesMerger

FilesMerger is a Gem that facilitates the merger of .csv and .txt files, providing a simple way for doing so.

With this gem you can:

  • Set a new delimiter for the merged file;
  • Set the complete path for each file or just the file names and an root diretory;
  • Set a name for the final merged file;
  • Sort the lines by an espeficic column;
  • Eliminate duplicates chosing priority by order;
    • Soon you'll be able to eliminate all duplicates.
  • Ensure that all the files have the same number of columns.

Compatibility

Until now, no problems with any version of Ruby.

Installation

Bundle

Add this line of code to your Gemfile application:

gem 'files_merger'

Then execute:

bundle

Manually

Execute the following code:

gem install files_merger

Usage

The FilesMerger is very simple to use, see:

First, you just need to decide if the parameters are going to be inputted via external file or code through the method set_file_parameters. It accepts two parameters:

  • option
    • true if the parameters will be inputted from an external file;
    • false if not.
  • path
    • the path to the external file.

Example:

FilesMerger.set_file_parameters option: true, path: "parameters.rb"

If any parameters are passed, the default is option: false, path: ""

If the chosen option was true, the file must follow the example below:

files = ['test1.csv', 'test2.csv']

delimiter 				";"
new_delimiter			" " #tab
files_to_merge    origin: "C:/Windows/User/Desktop/", files: files
#files_to_merge		origin: "C:/Windows/User/Desktop/", files: ['teste1.csv', 'teste1.csv']
final_file_name		"/Desktop/FINAL_FILE"
set_sort 				  option: true, column:"column_name"
elim_duplicity    option: true, key: "column_name", order: "fend"
limit_of_columns	true
file_extension		:csv

But, if you are going to use via code, follow the example:

FilesMerger.delimiter ";"
FilesMerger.new_delimiter "|"
FilesMerger.files_to_merge    origin: "C:/Windows/User/Desktop/", files: files
...

or

FilesMerger.delimiter(";")
...

It's Ruby syntax. You decide.

After set the parameters, call the build mehod, responsable for all the magic.

FilesMerger.build

Methods

These are the are the accepted methods, for now:

  • delimiter
    • Set delimiter in the files to merge.
    • Examples of Values:
      • ";"
      • ","
      • " " or in code "\t"- tab
      • "|"

Ex: FilesMerger.delimiter ","


  • new_delimiter
    • Set a new delimiter for the file after the merging;
    • If it's not used, the the delimiter will be the same used in the files.
    • Examples of Values:
      • ";"
      • ","
      • " " or in code "\t"- tab
      • "|"

Ex: FilesMerger.new_delimiter ";"


  • files_to_merge
    • Set the files to be merged;
    • Receives the an array with the names of all the files to be merged and the root folder of the files;
    • If all are in the same folder, you can just pass a root path and an array with just the names and extension of the files.

Ex:

files = ['teste1.csv', 'teste2.csv']

FilesMerger.files_to_merge(origin: Dir.home() << "/Desktop/", files: files)

or

files = ['C:/Windows/User/Desktop/teste1.csv', 'C:/Windows/User/Desktop/teste2.csv']

FilesMerger.files_to_merge(origin: "", files: files)

  • final_file_name
    • Set the name of the file resulted from the merging;
    • If not used, the default name will be "MERGED_YYYYmmDDHHMM.csv" (or .txt);

  • set_sort
    • Sorts the file lines;
    • The sorting will always be from up(smaller) to down(bigger);
    • If not used, nothing will be sorted.
    • parameters:
      • option - if you want to sort the file lines;
        • values: true or false
      • column - the column to be used as criterion for the sorting;

Ex: FilesMerger.set_sort(option: true, column:"column_name")


  • elim_duplicity
    • Eliminates duplicated lines;
    • Parameters:
      • option - if you want to eliminate duplicates;
        • Values: true or false.
      • key - the main value to be used as criterion.
      • order - the order used to go through the lines. If from above, the firt occurance will be kept, if from below, the last one will be kept. Pay attention if you used the sort option.
        • Values:
          • fend - from end.
          • fstart - from start.

Ex: FilesMerger.elim_duplicity(option: true, key: "columns_name", order: "fend")


  • limit_of_columns
    • Determinates if all files must respect a limit of columns;
    • Parameters:
      • Values: true or false;

Ex: FilesMerger.limit_of_columns(true)


  • file_extension
    • Set the extension for the file resulted from the mergin;
    • Parameters:
      • Values: :csv or :txt.

Ex: FilesMerger.file_extension(:csv) or FilesMerger.file_extension(:txt)