Menu

Create a database from your GEDCOM file, PHP Version

14 Juni 2011 - Genealogy, PHP

In response to a special plea, I translated my PERL version for creating a database out of a GEDCOM file to a PHP version. This version is also free for use.Only some personal data  for connection to database and to find the file must be adapted:

#############################################################################
## insert the path to file and the filename
##
$path     = "path/to/gedcom/file/"; # path to gedcom file
$filename = "gedcomfile.ged";       # GEDCOM file
##
#############################################################################
## Database connection
##
$hostport = "localhost";            # database host
$user     = "root";                 # database user name
$password = "";                     # database password
$database = "stammbaum";            # database schema name
##
#############################################################################

Afterwards the script can be called with a browser.

More words are not necessary, because everything is explained in the previous posts (Perl version) (Script for use in browser).

Have fun.

Download gedcomToDatabase.php as ZIP file

Table Structure:

--
-- Table structure for table `famchild`
--
 
DROP TABLE IF EXISTS `famchild`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `famchild` (
`famID` varchar(40) NOT NULL DEFAULT '',
`child` varchar(40) NOT NULL DEFAULT '',
PRIMARY KEY (`famID`,`child`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
 
--
-- Table structure for table `family`
--
 
DROP TABLE IF EXISTS `family`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `family` (
`famID` varchar(40) NOT NULL DEFAULT '',
`husband` varchar(40) DEFAULT NULL,
`wife` varchar(40) DEFAULT NULL,
`marr_date` varchar(255) DEFAULT NULL,
`marr_plac` varchar(255) DEFAULT NULL,
PRIMARY KEY (`famID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
 
--
-- Table structure for table `person_st`
--
 
DROP TABLE IF EXISTS `person_st`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `person_st` (
`persID` varchar(40) NOT NULL DEFAULT '',
`name` varchar(255) DEFAULT NULL,
`vorname` varchar(255) DEFAULT NULL,
`marname` varchar(255) DEFAULT NULL,
`sex` char(1) DEFAULT NULL,
`birt_date` varchar(255) DEFAULT NULL,
`birt_plac` varchar(255) DEFAULT NULL,
`taufe_date` varchar(255) DEFAULT NULL,
`taufe_plac` varchar(255) DEFAULT NULL,
`deat_date` varchar(255) DEFAULT NULL,
`deat_plac` varchar(255) DEFAULT NULL,
`buri_date` varchar(255) DEFAULT NULL,
`buri_plac` varchar(255) DEFAULT NULL,
`occupation` varchar(255) DEFAULT NULL,
`occu_date` varchar(255) DEFAULT NULL,
`occu_plac` varchar(255) DEFAULT NULL,
`religion` varchar(80) DEFAULT NULL,
`confi_date` varchar(255) DEFAULT NULL,
`confi_plac` varchar(255) DEFAULT NULL,
`note` longtext,
PRIMARY KEY (`persID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

________________________________________________________________

UPDATE:

Like you all see, Hans had some problems with my script. First and Last name and Mariage name are still empty in Database after using my script. He send me a small GEDCOM file of his own to find the problem and fix the bug.

And I found the problem. I will explain it:

I’m using the genealogical Software of the „The Church of Jesus Christ of Latter-day Saints“ (Mormons). It is called „Personal Ancestral File“ and this FREE SOFTWARE is using and filling absolutely all fields of the GEDCOM format. Most of other software do not use some of this fields, if they are not really necessary. So there is in my file a person always starting like this:

0 @I1@ INDI
1 NAME Carsten Dedo /Fröhlich/
2 SURN Fröhlich
2 GIVN Carsten Dedo
1 SEX M

The problem is, that the two lines with SURN and GIVN are not necessary, because this information is in the line with NAME, too. So they are not including this two line, but my first script was using this lines for the database inserts.

Now I created a second version of this script, which can be used in both versions, with or without SURN and GIVN lines.

For the marriage name I cannot help Hans, because his Software is not including this information (line starts with „2 _MARNM“) into the GEDCOM file. Because of this, I only can recommend to switch to an other software, if you really need the information of marriage name.

new version of PHP script for download.

Bug Fixed version from 05. Sep. 2011:
Gedcom To Database V1.0.2

Newest version: gedcomToDatabase V.1.0.3 (Bugfix in Extended Notes)
Newest version: gedcomToDatabaseV104 (Changes in Extended Notes)
Newest version: gedcomToDatabaseV105 (Changes in Extended Notes)

DB for Version 1.0.4 and 1.0.5:

 

DROP TABLE IF EXISTS `famchild`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `famchild` (
`famID` varchar(40) NOT NULL DEFAULT '',
`child` varchar(40) NOT NULL DEFAULT '',
PRIMARY KEY (`famID`,`child`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 
DROP TABLE IF EXISTS `family`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `family` (
`famID` varchar(40) NOT NULL DEFAULT '',
`husband` varchar(40) DEFAULT NULL,
`wife` varchar(40) DEFAULT NULL,
`marr_date` varchar(255) DEFAULT NULL,
`marr_plac` varchar(255) DEFAULT NULL,
`marr_sour` varchar(255) DEFAULT NULL,
`marb_date` varchar(255) DEFAULT NULL,
`marb_plac` varchar(255) DEFAULT NULL,
`marb_sour` varchar(255) DEFAULT NULL,
PRIMARY KEY (`famID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 
DROP TABLE IF EXISTS `person_st`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `person_st` (
`persID` varchar(40) NOT NULL DEFAULT '',
`name` varchar(255) DEFAULT NULL,
`vorname` varchar(255) DEFAULT NULL,
`marname` varchar(255) DEFAULT NULL,
`sex` char(1) DEFAULT NULL,
`birt_date` varchar(255) DEFAULT NULL,
`birt_plac` varchar(255) DEFAULT NULL,
`birt_sour` varchar(255) DEFAULT NULL,
`taufe_date` varchar(255) DEFAULT NULL,
`taufe_plac` varchar(255) DEFAULT NULL,
`taufe_sour` varchar(255) DEFAULT NULL,
`deat_date` varchar(255) DEFAULT NULL,
`deat_plac` varchar(255) DEFAULT NULL,
`deat_sour` varchar(255) DEFAULT NULL,
`buri_date` varchar(255) DEFAULT NULL,
`buri_plac` varchar(255) DEFAULT NULL,
`buri_sour` varchar(255) DEFAULT NULL,
`occupation` varchar(255) DEFAULT NULL,
`occu_date` varchar(255) DEFAULT NULL,
`occu_plac` varchar(255) DEFAULT NULL,
`occu_sour` varchar(255) DEFAULT NULL,
`religion` varchar(80) DEFAULT NULL,
`confi_date` varchar(255) DEFAULT NULL,
`confi_plac` varchar(255) DEFAULT NULL,
`confi_sour` varchar(255) DEFAULT NULL,
`note` longtext,
PRIMARY KEY (`persID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

 

Print Friendly, PDF & Email

35 thoughts on “Create a database from your GEDCOM file, PHP Version

Hans

Carsten, you did a great job and so quick! There is only one problem I don’t get the names from the gedcom into the database. The fields: name, vorname and marname in the table: person_st are still empty (after a few tries). Perhaps the problem is caused by the software I used to create the gedcom file. I used Aldfaer, Dutch freeware (http://aldfaer.net). I will send you the gedcom file and the: language_NL.properties file for use in I your familytree.zip by mail. Perhaps it helps you find a quick solution to this problem.

Reply
    luventas

    New version is free for download. See update of post.

    Reply
Aivar

Great script !
I used it to convert my Gedcom to sql (just for testing for now) and it seems to have small bug (at least for my gedcom) – my residence place was imported as my birth place. May be because there are 2 PLAC fields on one person data and then first one was just ignored. Below is sample for one person from Gedcom, exported from Geni.com
———————————————-
0 @I1@ INDI
1 NAME Firstname /Lastname/
2 GIVN Firstname
2 SURN Lastname
2 _MAR Lastname
1 SEX M
1 BIRT
2 DATE 01 JAN 1901
2 PLAC MyCity, MyCounty
2 ADDR
3 CITY MyCity
3 STAE MyCounty
1 RESI
2 PLAC MyResidence
2 ADDR
3 CTRY MyResidence
1 _EMAIL my@email.com
1 PHON +49 12345678
1 FAMC @F1@
1 FAMS @F92@
1 RFN online:reference
1 SUBM @I22@
1 OBJE
2 FORM text/html
2 FILE http://www
1 OBJE
2 FORM image/jpeg
2 TITL Title for photo
2 FILE http://www
1 OBJE
2 FORM text/html
2 FILE http://www
1 OBJE
2 FORM image/pjpeg
2 FILE http://www
1 CHAN
2 DATE 01 JAN 2011
3 TIME 09:15:55
——————————————

Reply
    luventas

    you are totally right! There was a bug in the script, because I do not fill this information about residence in my scripts. So I never recognized that.
    I will directly after this reply post my fix version, where the residence information as first fix is now ignored.
    Thanks for your hint!
    Greetings
    Carsten Fröhlich
    http://www.luventas-webdesign.de

    Reply
rené janssen

Hi Carsten,
Nice tool you made here. I wonder if you compleet this tool by adding the sources in a tabel. That would be great.

René

Reply
    luventas

    In which table should I add it? Sorry, but I do not understand this hint. Please explain it a bit more…

    Reply
      René Janssen

      hi Carsten,
      What i mean is this.
      I use the dutch program Alfaer to import the gedcom file.
      for every birth, bapt, death and marriage are sources available. like this one 2 SOUR @S274@
      You need to create extra colums in the person_st and one in the family tabel. example Birth_source, bapt_source.
      After this is done there is to create another tabel because every source has an description
      For @S274 is this
      0 @S274@ SOUR
      1 TEXT
      2 CONC Bidprentje
      In this example it means a Death source

      Another thing what’s going wrong is the death date. if there’s no death date the program used this date
      2 DATE 26 MAR 2010

      0 @I5320@ INDI
      1 RIN 5320
      1 REFN 743
      1 NAME Aldegundis Joannes/van der Aa/
      1 SEX F
      1 DEAT
      1 FAMS @F1269610238@
      1 _NEW
      2 TYPE 1
      2 DATE 26 MAR 2010
      3 TIME 14:58:12
      1 CHAN
      2 DATE 17 MAR 2011
      3 TIME 13:24:26

      René

      Reply
Jos Reintjens

The script works for me until 22500 persons, with 25000 is doesn’t work any more. Is there a maximum in the script or is there another problem. I’am working on a webprogram for myself and got more than 26000 persons in my database. Is there a solution for the problem?
Greetings Jos Reintjens

Reply
    luventas

    Hi Jos,

    26000 Persons is really a lot, but I do not included a maximum value in the script. Maybe you have to extend your Apache settings. Normally there is a maximum time included, how long a script can run. I think, you reach this time limit with the 23000 Persons. If you run the script on your local system or on a server where you are able to administrate, then try to extend this time limit and test it again.

    Greetings
    Carsten Fröhlich

    Reply
      Jos Reintjens

      Hello Carsten.
      I made the time inputs larger. but it didn’t help.
      I’ve got the next screen writings:

      Start reading GEDCOM file
      Start reading Person data
      Start reading family data

      Then it stops.

      The apache time limits are set to:
      Keep-alive timeout 500 (coming from 5)
      Request timeout 1000 (coming from 300)

      I did some changes to the script but i don’t think that’s the problem. The changes are:

      ## create single insert statement
      $insert = „INSERT INTO person_st (`record`, `Achternaam`, `voornaam`, `marname`, `Geslacht`, `Geboortedatum`, `Geboorteplaats`, „;
      $insert .= „`Doopdatum`, `Doopplaats`, `Overlijdendatum`, `Overlijdenplaats`, `Begrafenisdatum`, `Begrafenisplaats`, „;
      $insert .= „`Beroep`, `occu_date`, `occu_plac`, `Gezindte`, `confi_date`, `confi_plac`, `Persooninfo`) „;

      I hope you can point me to the problem.

      Thanks Jos Reintjens

      Reply
        luventas

        Hi Jos,

        you changed the DB-column names in your insert statment. Do you also changed the names of the columns in your MySQL DB?
        Original was:
        DROP TABLE IF EXISTS `person_st`;
        /*!40101 SET @saved_cs_client = @@character_set_client */;
        /*!40101 SET character_set_client = utf8 */;
        CREATE TABLE `person_st` (
        `persID` varchar(40) NOT NULL DEFAULT “,
        `name` varchar(255) DEFAULT NULL,
        `vorname` varchar(255) DEFAULT NULL,
        `marname` varchar(255) DEFAULT NULL,
        `sex` char(1) DEFAULT NULL,
        `birt_date` varchar(255) DEFAULT NULL,
        `birt_plac` varchar(255) DEFAULT NULL,
        `taufe_date` varchar(255) DEFAULT NULL,
        `taufe_plac` varchar(255) DEFAULT NULL,
        `deat_date` varchar(255) DEFAULT NULL,
        `deat_plac` varchar(255) DEFAULT NULL,
        `buri_date` varchar(255) DEFAULT NULL,
        `buri_plac` varchar(255) DEFAULT NULL,
        `occupation` varchar(255) DEFAULT NULL,
        `occu_date` varchar(255) DEFAULT NULL,
        `occu_plac` varchar(255) DEFAULT NULL,
        `religion` varchar(80) DEFAULT NULL,
        `confi_date` varchar(255) DEFAULT NULL,
        `confi_plac` varchar(255) DEFAULT NULL,
        `note` longtext,
        PRIMARY KEY (`persID`)
        ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

        You have to change this also to:
        DROP TABLE IF EXISTS `person_st`;
        /*!40101 SET @saved_cs_client = @@character_set_client */;
        /*!40101 SET character_set_client = utf8 */;
        CREATE TABLE `person_st` (
        `record` varchar(40) NOT NULL DEFAULT “,
        `Achternaam` varchar(255) DEFAULT NULL,
        `voornaam` varchar(255) DEFAULT NULL,
        `marname` varchar(255) DEFAULT NULL,
        `Geslacht` char(1) DEFAULT NULL,
        `Geboortedatum` varchar(255) DEFAULT
        etc.

        And also the column name are needed in the whole application. Mybe it is better to leve the DB clumn names like they are. How they are named in view on website is not related to the DB column name.

        Hope, this will help you.

        Greetings
        Carsten Fröhlich

        Reply
          Jos Reintjens

          Hello Carsten
          I put the original script back. I loaded the file with 23000 persons. It worked. I took the file with 25000 persons, and got the same problem.

          Start reading GEDCOM file
          Start reading Person data
          Start reading family data

          Then I took the total file with 26000+, and I had the next screen writing:
          Start reading GEDCOM file

          Nothing more. I’m not a programmer only a guy who likes to do some coding. I’m out of solutions. Thank for your time and help. If you know something pease help me i’m out of solutions.

          Greetings
          Jos Reintjens

          Reply
          luventas

          I tried it with the original data of Jos and found no problem. It seems to be a problem of the configuration of the system.

          Reply
          Jos Reintjens

          Hello, I found the problem. The memory_limit in the php.ini was set on 128M. I chenged it in 256M and it works great. Carsten Thank you for your help.
          Greetings Jos Reintjens

          Reply
Jan van der Velde

Hello Carsten, first off all, thank you for this script, it helped me enormously.
Unfortunately I miss a few pieces of information and I don’t completely understand your code. Perhaps you can help me in the right direction by pointing to the right steps to take.

First problem: In many old documents from the early 18th and 17th century I can’t find the marriage date but only the date of the marriage announcement. This is the Gedcom MARB tag. To be complete I want both the MARR (marriage) and MARB (marriage announcement) tag added to the database. I understand how to create a new field in the family table but see many different steps in your script that I can’t exactly understand. Can you explain the steps to me ?

Second problem: I really would like to add the Source information, the Gedcom SOUR tag, per event to a table. But as I see it it would take a huge amount of rewriting to the script and even to the tables, or could there be a more easy solution ?

Reply
    luventas

    Hi,

    please give me 2 or three days, then I will change my script to include all data you need.

    Greetz
    Carsten

    Reply
      Jan van der Velde

      That’s very generous..
      Take all the time you need, I’m very glad I don’t need to program it myself.. 🙂

      Reply
Jan van der Velde

Hello Carsten,
Just being curious, but how are you getting along with the adjustments to the script.
I can imagine that adding the „sources“ is a pain.. 🙁
But I would even be happy if you only managed to get the marriage announcements in the database. That should not be that difficult I guess..
Best Regards,
Jan van der Velde

Reply
    luventas

    Hi Jan,

    I was nearly ready with all changes, but the SOUR tag is a little bit more tricky that I thought.
    So I will attac the version 1.0.4 of my script to this POST which is able to write the MARB tag to DB.
    I reduced my changes in the sript to make it running. It is prepared to read the SOUR tags but until now this part is not working. The problem is, that I have no GEDCOM which includes Sources…

    But firstly have fun with the Version 1.0.4

    Reply
      Jan van der Velde

      Thank you Carsten!

      I knew this was tricky.. 🙂

      I can send you a file full off „sources“ if you want.. 🙂
      You have my email address from my post (I presume).
      Contact me and I’ll send you one.

      Reply
Bert

Carsten

I’ve just come about your script (v1.4) and tested it.

But I have come about a nasty problem: I have lots of names with accented E’s and A’s (e.g. Hélène) I can’t seem to get these loaded into the database as they schould. I’ve made changes to the column definitions (Collation) and added a „mysql_set_charset(‚utf8‘,$Conn);“ statement to the script but I continue to get incorrect characters in my DB.

Do you have any idea?

Rgds,

Bert

Reply
    luventas

    You have to add the php function utf8_decode around every single entry, you want to store utf8 decoded in the database.

    Example:
    array_push($person,
    $indi.“;“.utf8_decode($surn).“;“.utf8_decode($givn).“;“.$marn.“;“.$sex.“;“.$birtdate.“;“.$birtplac.“;“.$birtsour.“;“.$chrdate.“;“.$chrplac.“;“.$chrsour.“;“.$deatdate.“;“.$deatplac.“;“.$deatsour.“;“.$buridate.“;“.$buriplac.“;“.$burisour.“;“.$occu2.“;“.$occudate.“;“.$occuplac.“;“.$occusour.“;“.$reli.“;“.$confdate.“;“.$confplac.“;“.$confsour.“;“.$note);

    Then the entries are stored like you want.

    Reply
Howie Milburn

Hi Jan,

I have just tried your V104 script. The script looked like it worked OK – All done being displayed. However, when I checked the MySQL database table person_st I only found one record. Interestingly, this was the last NAME record in the GEDCOM file. My GEDCOM file was produced today from ancestry.co.uk

Can you advise me on this problem

Thanks

Howie

Reply
Howie Milburn

Hi Jan,

Further to my last comment. I have checked the one record in the database and although the name is from the last record in the GEDCOM file all the others fields contain data from various other GEDCOM lines e.g. the birt_sour field contains data from the first record in the GEDCOM file.

Hopes this helps the investigation

Howie

Reply
Howie Milburn

Hi Jan,

I have found the fix to my problem on your ‚Include your family tree in your webpage, new Version‘ page – replace I with P

It might be useful to put this fix on this page – ‚Create a database from your GEDCOM file, PHP Version‘

Thanks for a great script

Howie

Reply
luventas

Hi Howie,

there where two issues, why this is not working.
One is in my header.php, one is somehow in your gedcom file:

The first Person, „Firstname Lastname, @P1“, has no gender given. This is the error message you got.
I changed the script to have female as default gender. It is attached as gedcomToDatabaseV105.php in a zipfile. (It also includes the P in the Personal data, so you do not need to change that)

The second one is a problem on my site: I included somwhere in past the user_id to the database, but it is not filled until now, which means, that it is always 0.
But in the Homepage script it in all database queries it is called with 1, so we do not get any data from database.

What you now have to do:
– Remove all Data from database
– Call the script gedcomToDatabaseV105.php to fill it again
– open the header.php and change the line 3 from „$_SESSION[‚userid‘] = 1;“ to „$_SESSION[‚userid‘] = 0;“, then everything works like it should.

Greetings

Carsten Fröhlich

Reply
Colin

I have just tried the latest version of your script from a PAF exported Gedcom file. I am getting the following error message

„Undefined offset: 1 in C:\wamp\www\gedcom.php on line 141“

(I have renamed your file to just gedcom.php and edited the database connection settings nothing else). Do you have any idea what may be causing this?

Thanks

Reply
Fabien

Hi Luventas,

A great job for your script but i have a problem tu use on V1.05 (With I)
Result :
INSERT INTO famchild(`famID`, `child`) VALUES(“, ‚313I‘);
INSERT INTO famchild(`famID`, `child`) VALUES(“, ‚760I‘);

famID is always empty, probably about the format type, it’s not bigining by „I“ but finishing it
0 @8I@ INDI
0 @2017U@ FAM (I have already replaced F by U on line 360
0 @2010S@ SOUR

Could you help me ?

Thanks

Fabien

Reply
    luventas

    Hi Fabien,

    it is nor enough to replace the F by the U because it is not the correct sequence of regular expression on this place then.
    The Regex „/0\x20\x40(F.*)\x40/“ means Zero, then a whitespace, then a @-sign, then „F“, then 0-n any signs, follewoed by a @-sign.

    „0 @2017U@“ canot be found by this Regex, because the any signs are in front of the U. Please change the line 360 in the script to

    } else if(preg_match(„/0\x20\x40(.*U)\x40/“, $lines[$i], $famindiA)) {

    then it will work for you.

    Greetings
    Carsten

    Reply
Erik Jensen

Hi Luventas
Right now I am working with your the script ver. 1.0.5 so I can get it to work with my gedcom file from The Master Genealogist program. But not all the data is coming in and not in the correct order.
Will it be possible to get some support here with the work or is that to late?

Rgs.
Erik Jensen

Reply
    luventas

    you can send me your questions on my mail address: carsten.froehlich@luventas-webdesign.de

    Reply
Arnold

Just found your site and was playing with the code for your ’stammbaum‘ a bit using Netbeans (NB).
Did not get too far with it as the bits I have found do not seem to mesh all the well.
It looks like the pieces all belong to different versions.
After figuring out that I had to run Sql queries to build the DB, I found the tables entries created did not match up with those expected by the code of gedcomtoDatabase.
One other basic problem seems to be that NB seems to want to edit the files as UTF-8, but complains about unspecified dangers if I do open a file.
Although I am relatively new to both PHP & Netbeans, I did managed to learn a bit more.

In any case, thank you for posting as much as you have.
My main aim in trying this code was to learn more about the interaction of PHP with GEDCOM and that I have been able to do – though it would be more interesting to get it all working as intended.

Reply
M A K

Hi,

I want to create a gedcom file from database in PHP.
Do you have some code to do that ?

Reply
    luventas

    Sorry, I have not written such a code yet.

    Reply
Dave B

Thanks for this, Carsten!

I notice you use mysql_connect, which was removed in PHP 7.0.0.
Would you be able to find time to post a new version of the code to reflect this?
The PHP website says that the alternatives are mysqli_connect() or PDO::__construct()

Thanks!

Reply

Schreibe einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht. Erforderliche Felder sind mit * markiert.