I agree with John. You're right that unix cat is a good solution
for concatenating delimited files in identical formats (and I think
you can find windows-based cat programs) but that's an unusual
use-case. The problem that John is pointing out is that PSPP and
SPSS are assigning meta-data to the columns and that process is
imperfect (although, in my experience, it works a lot better if the
data are all generic numeric scalars).
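
For what it's worth, the cat-style approach can be sketched in a few lines of Python. This is only an illustration, and it assumes each file starts with a header row (a plain cat would repeat that row in the middle of the output). The function name concat_csv, the file names, and the sample data are all made up for the example.

```python
import csv
import io

def concat_csv(sources, out):
    """Append the rows of each source to out, writing the header only once.

    sources: iterable of (name, text-file-like) pairs (hypothetical names).
    Raises ValueError if a file's header differs from the first header seen.
    """
    writer = csv.writer(out, lineterminator="\n")
    header = None
    for name, handle in sources:
        reader = csv.reader(handle)
        first = next(reader, None)
        if first is None:
            continue  # empty file: nothing to add
        if header is None:
            header = first
            writer.writerow(header)
        elif first != header:
            raise ValueError(f"{name}: header {first} does not match {header}")
        writer.writerows(reader)

# In-memory stand-ins for two delimited files in identical formats.
a = io.StringIO("id,score\nx1,3\nx2,5\n")
b = io.StringIO("id,score\nx3,4\n")
out = io.StringIO()
concat_csv([("a.csv", a), ("b.csv", b)], out)
combined = out.getvalue()
print(combined)
```

The header check is the only thing this adds over cat: it refuses to combine files whose columns do not line up, which is the "identical formats" condition above.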
Programs like PSPP work well when you are joining files in more
complex ways. You're basically complaining that flyswatters work
well for killing flies but hammers are less effective and leave big
holes in the wall.
I haven't examined how PSPP handles this, but I had the impression
that SPSS looks at the first few records (maybe it's all the
records, but it seems unduly influenced by early records) to guess
the meta-data, which works reasonably well if there's a single
file. But treating each file independently has the potential to
cause a lot of trouble when there are several files. My worst-case
scenario is using SPSS to join several delimited files on an
alphanumeric key (e.g., email or Qualtrics' id or a hash). What I
wish PSPP/SPSS would do is to detect that the keys have different
lengths and either just ignore it or silently increase the length of
the smaller key. If PSPP already does this, kudos!
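
Until then, one workaround is to find the widest key across all the files before importing any of them, so the same string width can be declared for every file. Here is a rough sketch in Python, assuming each file has a header row naming the key column. The function name max_key_width, the column name "email", and the sample data are hypothetical, and this is not a description of what PSPP or SPSS does internally.

```python
import csv
import io

def max_key_width(sources, key):
    """Return the length of the longest value in column `key` across all sources."""
    widest = 0
    for handle in sources:
        for row in csv.DictReader(handle):
            widest = max(widest, len(row[key]))
    return widest

# In-memory stand-ins for two delimited files whose keys differ in length.
first = io.StringIO("email,score\nal@x.org,3\n")
second = io.StringIO("email,score\nlonger.name@example.org,5\n")
width = max_key_width([first, second], "email")

# A string format of this width (e.g. A<width>) could then be declared
# identically for the key in every file before joining.
print(f"A{width}")
```

Scanning every record, rather than only the first few, avoids the case where an early short key sets a width that later records overflow.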
-Alan
On 11/12/2013 2:04 AM, Ken Singh wrote:
Thank you.
In the case of concatenating the csv files I don't think
format specifiers are essential. As long as the files are of
exactly the same format all variables will align. Once
combined the file could be saved then loaded into the editor
(which guesses the formats for each column). That said, I
understand PSPP is meant to be a clone of SPSS, so likely
there is no good solution available. It may still be the case
that it's more expedient to use cat or "copy /a" to join
files then import into the editor. I had attempted a
variation of your second suggestion but maybe conceded too
early. I'll play with both of your suggestions. Thanks
again.
K.
--
Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.
+815.588.3846 (Office)
+267.334.4143 (Mobile)
http://www.alanmead.org
Announcing the Journal of Computerized Adaptive Testing (JCAT), a
peer-reviewed electronic journal designed to advance the science and
practice of computerized adaptive testing: http://www.iacat.org/jcat