My own mini-scanfests

When you come back home after a productive research trip to an archive or library, do you often end up with a stack of photocopies?

Yes, me too.

I use my digital camera whenever I can but sometimes it just isn’t possible to take photos. Sometimes the repository doesn’t allow it, and other times the documents are folded up so well that it is just easier to get the experts to photocopy them. When I get home I tend to leave them for a while in the ‘filing’ pile, and the longer they stay there the harder it is to get around to dealing with them.

For me a major part of the post-research trip process is scanning the photocopies. A piece of paper is no good to me if it fades or gets tea spilled on it, or the laser toner sticks to something other than the paper, or it goes up in a bushfire.

To address the post-research filing issue I bought one of those multi-function printers. It prints in colour and black-and-white, it scans, it photocopies, and it faxes. It’s a marvel of modern technology. When I chose it I made sure of two things –

  1. it prints and scans both sides of the paper (duplex)
  2. it has a document feeder

The duplex requirement is fairly self-explanatory. The document feeder means I can put a stack of pages in the top, press some buttons to tell it to scan to my laptop, and away it goes. All I have to do is press the OK button on the laptop, and then I can get on with something else. If both sides of the page need to be scanned I can select that option and the pages are scanned in the correct order.

Of course, at some stage I have to rename the files to something more meaningful than SCAN0001.jpg or whatever I’ve chosen as the default, but I can do that later, and sitting down.
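If the renaming pile gets too big, a small script can do the boring part. The following is only a sketch in Python, under some assumptions of mine: the scans sit in one folder, they are named like SCAN0001.jpg, and they are in page order. The function name `rename_scans` and the label format are made up for illustration, not something the scanner software provides.

```python
from pathlib import Path


def rename_scans(folder, label, pattern="SCAN*.jpg", dry_run=True):
    """Rename scanner output to '<label>_p01.jpg', '<label>_p02.jpg', ...

    Files are taken in name order, which matches scan order when the
    scanner numbers them sequentially (an assumption).
    Returns a list of (old_name, new_name) pairs.
    With dry_run=True nothing is renamed, so the plan can be checked first.
    """
    folder = Path(folder)
    scans = sorted(folder.glob(pattern))
    planned = []
    for page, old_path in enumerate(scans, start=1):
        new_name = f"{label}_p{page:02d}{old_path.suffix.lower()}"
        new_path = old_path.with_name(new_name)
        if new_path.exists():
            # Refuse to overwrite a file that is already there.
            raise FileExistsError(f"{new_name} already exists")
        planned.append((old_path.name, new_name))
        if not dry_run:
            old_path.rename(new_path)
    return planned
```

Calling `rename_scans("scans", "smith-will-1891")` only lists what would change; adding `dry_run=False` does the renaming. The dry run is the default so that a typo in the label doesn't get applied to a whole stack of pages.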

My scanner is not much bigger than A4, so A3 photocopies are a problem. There are a couple of solutions – perhaps you have others?

  1. scan each half at a time, making two images that can then be joined together (or not!) in your photo software
  2. photocopy the A3 at a library or somewhere with a big photocopier, reducing it to A4, and then scan the A4 photocopy. Yes, some quality is lost, but it takes much less time and is more likely to result in a useable scan than option 1, which I rarely get around to doing.

Another important part of the process is to write the citation on the photocopy before scanning it, if I haven’t already done it at the time of the photocopying. If I’ve requested copies at State Records NSW I pay for them before I leave and so this labelling must be done at home, preferably the same day while the file is still fresh in my mind.

Then there’s the analysing, data entry, filing into my family binders, and all of the other tasks that give meaning to whatever I’ve found, but that’s another story.

What do you do with your photocopies when you get them home?

Picasa face-recognition scan conclusions

Picasa face recognition

I have posted previously about letting Picasa 3 scan for faces so I can identify them. I had hoped to publish the results at the time but I was caught up with other things and didn’t get a chance.

Unfortunately I don’t have an accurate record of how long it took. I started it on about the 1st October with 14,000 photos to process. On the 4th it was 50% completed after I had added an additional 5000 photos because I added some of the folders under Documents. On the 5th it was saying all day that it had 51% to go. Then that evening it changed to 52%. I thought it was going to take another week, but the next day it was finished.

That’s 5-6 days. For 19,000 photos.
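To put that in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes the scan ran around the clock for the whole period, which is only approximately true since the start date is a guess and the extra 5000 photos were added partway through; the function names are just for illustration.

```python
def photos_per_hour(photo_count, days):
    """Average photos processed per hour, assuming 24 hours a day."""
    return photo_count / (days * 24)


def seconds_per_photo(photo_count, days):
    """Average seconds spent on each photo over the same period."""
    return days * 24 * 3600 / photo_count


for days in (5, 6):
    rate = photos_per_hour(19_000, days)
    secs = seconds_per_photo(19_000, days)
    print(f"{days} days: about {rate:.0f} photos per hour, {secs:.0f} seconds each")

# 5 days: about 158 photos per hour, 23 seconds each
# 6 days: about 132 photos per hour, 27 seconds each
```

So on these assumptions it was spending somewhere around 23 to 27 seconds on each photo.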

It ran for 24 hours a day, and I only closed it down occasionally when it was slowing down what I was doing. It used an average of 45% of my CPU, so sometimes this was a problem. I don’t remember the processor that my laptop has, but it’s a bit over 2 years old.

Of course, not all of these photos have people in them – there are landscapes, wildlife, and images of documents.

Some things I have noticed:

  • if I sign in to Google it can get the names from my contacts list
  • it runs very slowly at some times and quite quickly at others
  • it picks up faces from the covers of books and photos on the wall behind the real people
  • it can find faces in very fuzzy pictures
  • it is not bothered by hats and sunglasses
  • it quite often suggests the wrong person but that person is closely related, such as a sister, aunt or grandmother
  • it identifies people more accurately the more photos you have identified
  • it can identify people at all ages in their lives
  • it is better at identifying babies than I am
  • it doesn’t recognise cats, dogs or gorillas, although it did identify one front-on picture of a dog
  • I have a lot of duplicate photos, and when I identify one it suggests the same name for the others very quickly
  • I am terrible at remembering names
  • I nearly have more photos of my nieces than I have of my husband or myself

By the time it finished it said it still had about 6500 faces to identify. I am slowly whittling those down. I now have just over 5000. There are also the faces it can’t identify as faces, which I have to do manually if I want it done at all.

It seems to have trouble with faces that are:

  • at an angle
  • partly covered by hair on one side
  • turned side-on, unless they are in complete profile
  • really, really fuzzy

And yet sometimes it sees a face where there isn’t one. I thought this one must be in the background somewhere.

Panda face

He looks like he has a little beard and a receding hairline.

This is the photo it came from:

Picasa panda

Can you see the face, in the top right corner? Not a face at all!

It also picks up the hundreds of faces in the backgrounds of photos and wants to know who they are. You can mark each one as ignored, and you can see these later if you want to. When the Sydney Harbour Bridge was 75 years old they opened it to the public to walk across, and the photos from that day have many people in the background. Fortunately they are mostly wearing lime green hats so I could quickly exclude them when I saw them.

All the people in a wedding photo can be identified if you have already identified them elsewhere. Even if you don’t know their names you can give them a number, like Wedding 12, and group photos of the same person together. You can then more easily identify the person, or a relative can, when you can see a number of photos of the same person together.

I have had a wonderful time with Picasa, and I still am. I am finally learning, through having to identify photos, which of my grandmother’s three sisters is which, and what my mother’s older brothers looked like when they were young.

I have also very much enjoyed seeing pictures of the same person throughout their lives all in the one place. Here are some of my grandmother Amy Eason nee Stewart:

Amy Millicent Eason nee Stewart

You can see her from the earliest photo of her that I have, when she was a baby; as a teenager, a young mother, and so on all through her life. The photos are of varying quality but the only one I had to manually identify was the blurry side-on one in the 3rd row.

A valuable lesson I learned was in trying to identify what it is that makes this person look like that person. What is it in my face that Picasa mistakes for my grandmother’s? Or two of three nieces but not the third?

To be fair, sometimes Picasa is totally wrong. It tried to tell me that this same grandmother was in a shot of my husband posing with the Wests Tigers rugby league team. It wasn’t. When it ‘groups’ unnamed faces it tends to put faces together that are shot at the same angle. Sometimes I think it is suggesting names based on the frequency with which that name appears, or on the previously identified name, but that might just be my cynicism.

All in all I am so glad I went through this exercise. Identifying faces has become my procrastination-of-choice, and it has made me much more likely to name the faces in photos I have just taken rather than leave it for years, until I can no longer remember the names. I am also determined to research the names I should know but can’t remember – school classmates, fellow safari tourists, even Wests Tigers. All those unnamed faces bother me!

Whose face is that? – Picasa 3

I have recently upgraded my Picasa to version 3 and let it start running through my photos looking for faces so I could tag them. Picasa is photo organising, editing and sharing software from the Google people. It’s free.

The scan started two days ago, and it’s now 32% of the way there. Yes, I have a lot of photos. I have restricted it to photos in the My Pictures folder for the present, which it says contains about 14,000 photos.

Despite the slowness of it, and the fact that it uses up to half my CPU continuously, I will let it finish. I really like it. I am amazed at how it recognises faces, and find it much more useful than I expected.

It works like this. It goes through all my folders of photos that you see on the left, looking for faces.

Picasa faces

When it finds one it draws a box around it and asks you who it is. If it thinks it knows it makes a suggestion for you to confirm. Simple!

Picasa bunnies

Out of all the lions in this photo it picked out my niece, Madeleine [sorry Mad]. If I want to ignore the others in the photo I can click on the X.

Eventually, it has a list of people and shows you the thumbnails of the person from each photo in which he/she appears. If it has made a guess then it asks you to confirm. Here you can see some suggestions it has made about photos of me:

Picasa confirm my face

They are all me! I can click on the green tick for each one, or remove the ones that aren’t me and then click on ‘confirm all’.

Where it gets tedious is when it doesn’t recognise what it sees as a face, because it’s tilted at an angle or half in the shade. You can manually draw a box around the face and name it just the same. It also has trouble with fuzzy old black and white photos, although not as much trouble as I feared.

Where it gets interesting is not where the suggestions it makes are correct, but where they are nearly correct. It chooses siblings or direct ancestors such as parents or grandparents.

Actually, I don’t know whether it’s just going for the law of averages. When it identifies a photo of my grandmother as being me it is very interesting to try to work out why. Sometimes it’s a face at a similar angle and lighting to another photo, but sometimes it must be facial similarities.

Try it out for yourself! I’ll let you know when it is finished. It seems to be speeding up, but it will still be some days away.