Stack data from multiple delimited files into one, with variable columns

I'm new to scripting and got stuck with a file-merge issue on Unix. I was looking for some direction and stumbled upon this site. I saw many great posts and replies, but couldn't find a solution to my issue. I'd greatly appreciate any help.



I have three csv files: Apex_10_Latest.csv, Apex_20_Latest.csv, Apex_30_Latest.csv. The number of columns varies between these 3 files; typically the latest file (based on the numbering) has some new columns appended at the end. I want to take the header from the latest file and stack the data from all 3 files into a new file, Apex.csv. When stacking data from an older file that has fewer columns than the latest one, I want the missing fields populated as null, with the appropriate delimiters.



Also, this has to be repeated for multiple sets of files (3 each), all in the same folder:
- Apex_10_Latest.csv, Apex_20_Latest.csv, Apex_30_Latest.csv - merged into Apex.csv
- Code_10_Latest.csv, Code_20_Latest.csv, Code_30_Latest.csv - merged into Code.csv
- Trans_10_Latest.csv, Trans_20_Latest.csv, Trans_30_Latest.csv - merged into Trans.csv



Following is the format of the source files and the expected target file:

SOURCE FILES:

  • Apex_30_Latest.csv:
    A,B,C,D
    1,2,3,4
    2,3,4,5
    3,4,5,6

  • Apex_20_Latest.csv:
    A,B,C
    4,5,6
    5,6,7
    6,7,8

  • Apex_10_Latest.csv:
    A,B
    7,8
    8,9
    9,10

EXPECTED TARGET FILE:

  • Apex.csv
    A,B,C,D
    1,2,3,4
    2,3,4,5
    3,4,5,6
    4,5,6,,
    5,6,7,,
    6,7,8,,
    7,8,,,
    8,9,,,
    9,10,,,


Thanks...

Tags: text-processing, files, awk, csv, merge

asked Oct 23 '15 at 2:26 by wamshi; edited Oct 23 '15 at 4:43 by Wildcard

  • Welcome to Stack Exchange! Great first question. I believe the commands cut and paste can be configured to do what you want, but you may need to use a more fully featured text-processing tool such as awk.
    – Wildcard
    Oct 23 '15 at 2:34

  • A question: do the headers always start the same? Or might you have A,B,D in one file? Or C,D,E? In other words, is column 3 in one file the same as column 3 in every other file that has 3 columns? (And for every other number in place of 3?)
    – Wildcard
    Oct 23 '15 at 5:37
4 Answers

  cat $(ls -1 Apex_*_Latest.csv | sort -nr -k2 -t'_') | awk -F',' '
  NR == 1 { nfm = NF }              # first line comes from the latest file: remember its column count
  {
      for (i = 1; i <= nfm; i++)    # pad every record out to nfm fields;
          printf "%s%s", $i, (i < nfm ? "," : "\n")   # absent fields print as empty strings
  }' > Apex.csv


You can reverse-sort the filenames on the second underscore-delimited field (30, 20, 10, ...) and cat the files so that the lines with the highest number of columns come first.

Then, with awk, you can capture the highest number of columns NF from the first line:
NR == 1 { nfm = NF }

Then run a for loop with the column number i going from 1 up to nfm, printing each field followed by a comma (or a newline after the last field). If there is no value for the ith field (which happens when a file has fewer columns than the latest one), awk prints an empty string, so only the delimiter appears.
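
The question also asks for this to be repeated for every set of files in the folder (Apex, Code, Trans, ...). A minimal sketch of one way to wrap the same command in a loop over prefixes (assuming every group follows the Prefix_NN_Latest.csv naming shown in the question):

    # derive the unique prefixes (Apex, Code, Trans, ...) from the filenames
    for prefix in $(ls *_*_Latest.csv | sed 's/_.*//' | sort -u)
    do
        # newest (widest) file first, then pad every row to the header width
        cat $(ls -1 "${prefix}"_*_Latest.csv | sort -nr -k2 -t'_') |
            awk -F',' 'NR == 1 { nfm = NF }
                       { for (i = 1; i <= nfm; i++) printf "%s%s", $i, (i < nfm ? "," : "\n") }' \
            > "${prefix}.csv"
    done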






– answered Oct 23 '15 at 9:02 by 7171u

  • You can remove everything on the first line up to awk and insert $(ls -r Apex_*_Latest.csv) on the last line between the last single quote ' and the >. That would eliminate cat and sort, be less code, and be easier to understand.
    – RobertL
    Oct 26 '15 at 3:24
I think the previous answer is the best; I'm just showing a different approach, since I haven't used awk in years (not since perl and python became big). awk is fine, it's just that a mixture of shell, sed, python and/or perl has suited my work better.

However, in this case, I think anyone can see that the awk solution is more succinct and easier to read. Come to think of it, I've heard awk referred to as the command-line spreadsheet, or something like that. :-)

Based on the original post, I've chosen to let the ls command sort the filenames by file modification time, as opposed to relying on the filename format. Six of one, half a dozen of the other.

So, just for comparison, here's an efficient, portable, modular(?!), pure-shell version of the solution:



#!/bin/sh

# Print only the commas from the first line of the input (or of the named files).
get_commas() {
    sed 's/[^,]//g; 1q' "$@"
}

# Emit however many extra commas are needed to pad a file's rows
# out to the width of the newest header.
get_extra_commas() {
    local maxcommas="$1"
    local file="$2"
    local new_commas=$(get_commas "$file")
    local extra_commas=""
    while [ "${new_commas}${extra_commas}" != "${maxcommas}" ]
    do
        extra_commas=",$extra_commas"
    done
    echo "$extra_commas"
}

unset header
ls -t Apex*.csv |                  # newest file first, by modification time
while read filename
do
    if [ -z "$header" ]
    then
        # Take the header line (and its comma count) from the newest file only.
        header="$(sed 1q "$filename")"
        commas=$(echo "$header" | get_commas)
        echo "$header"
    fi
    # Drop each file's header line and append the padding commas to every row.
    extra_commas=$(get_extra_commas $commas "$filename")
    sed "1d; s/$/$extra_commas/" "$filename"
done
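
The script writes the merged CSV to standard output, so a possible invocation (assuming it is saved as merge_latest.sh, a filename I'm choosing for illustration) is:

    sh merge_latest.sh > Apex.csv

To handle the Code and Trans groups as well, change the ls -t Apex*.csv line to the matching pattern and redirect to the corresponding output file.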





– answered Oct 24 '15 at 22:25 by RobertL

  • The analogies that have stuck with me from a variety of sources are: sed is the assembly language of text processing; awk is the command line spreadsheet tool; perl is the omnipotent god of text processing. :)
    – Wildcard
    Oct 27 '15 at 6:19










  • More like omnipotent demon! :-)
    – RobertL
    Oct 27 '15 at 6:45
Here is an answer implemented in Miller:



$ cat rect.mlr
for (k, v in $*) {
    @fields[k] = v;         # retain already-seen field names
}
for (k, v in @fields) {
    if (isabsent($[k])) {
        $[k] = "";          # fill in any field this record is missing
    }
}

$ mlr --csvlite put -f rect.mlr Apex_30_Latest.csv Apex_20_Latest.csv Apex_10_Latest.csv
A,B,C,D
1,2,3,4
2,3,4,5
3,4,5,6
4,5,6,
5,6,7,
6,7,8,
7,8,,
8,9,,
9,10,,


Since Miller handles named columns intrinsically, header-line management becomes simpler.
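
To produce one merged file per prefix, as the question asks, a minimal sketch (reusing the rect.mlr file above and assuming the fixed 30/20/10 naming from the question) is:

    for prefix in Apex Code Trans
    do
        # newest file first, so its header establishes the full column set
        mlr --csvlite put -f rect.mlr \
            "${prefix}"_30_Latest.csv "${prefix}"_20_Latest.csv "${prefix}"_10_Latest.csv \
            > "${prefix}.csv"
    done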






– answered Nov 12 '16 at 2:16 by John Kerl

With Miller (http://johnkerl.org/miller/doc/), as usual, it is very easy:

    mlr --csv unsparsify Apex_*_Latest.csv

gives you

    A,B,C,D
    1,2,3,4
    2,3,4,5
    3,4,5,6
    4,5,6,
    5,6,7,
    6,7,8,
    7,8,,
    8,9,,
    9,10,,

– answered 2 days ago by aborruso
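
One thing to keep in mind: the shell expands Apex_*_Latest.csv alphabetically (10, 20, 30), so the rows from the oldest file are read first. If you want the newest rows first, as in the output shown above, and the result written to the target file, you can list the files explicitly; for example (the same unsparsify verb, just with an explicit file order and a redirection):

    mlr --csv unsparsify Apex_30_Latest.csv Apex_20_Latest.csv Apex_10_Latest.csv > Apex.csv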




