REDIRECTED: Possible to search for year in attribute field?


Dan Kaplow has producer credits for Lady Dynamite that have the year 2015 in the attribute section. These should of course be deleted and I'm happy to do so, but I was wondering if it's possible to search for years in the attribute field so all instances where a year is listed in the attribute section can be deleted.

Marco.

Hey, Marco:

"I was wondering if it's possible to search for years in the attribute field"

No, it cannot be done directly, but the IMDb plain text data files may be useful in this situation. Please see SECTION O at https://getsatisfaction.com/imdb/topics/problem-with-also-archive-footage?topic-reply-list%5Bsettings%5D%5Bfilter_by%5D=all&topic-reply-list%5Bsettings%5D%5Breply_id%5D=16894462#reply_16894462, where this issue is covered for acting credits.

If you wish me to do so, I can expand the search to all the available crew positions on the FTP server and post a PDF file with most of the instances where a year (or year range) is listed in the attribute or character fields (excluding series main level, where some might be correct), so you (or any other contributor who wants to) can work on it. 😃
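For anyone who wants to try this kind of search themselves, here is a minimal sketch of the idea in Python. It is not the script used for the Get Satisfaction thread. It assumes the attribute field has already been pulled out of a credit line as a string of parenthesised groups such as "(producer) (2015)"; that layout, the function name year_attributes and the accepted year range (1800s to 2000s) are illustrative assumptions, not the actual .list file format.

```python
import re

# Assumption: attributes look like "(producer) (2015)", i.e. one or more
# parenthesised groups. A group counts as a year if it holds a four-digit
# year, optionally followed by a range end ("2010-2012", "2014-????").
YEAR_ATTR = re.compile(
    r"\((?:18|19|20)\d{2}"                         # start year
    r"(?:\s*[-/]\s*(?:(?:18|19|20)\d{2}|\?{4}))?"  # optional range end
    r"\)"
)

def year_attributes(attribute_field):
    """Return every parenthesised year or year range found in the field."""
    return YEAR_ATTR.findall(attribute_field)

if __name__ == "__main__":
    for attr in ["(producer) (2015)", "(executive producer) (2010-2012)", "(producer)"]:
        print(attr, "->", year_attributes(attr))
```

A credit would be flagged for review whenever the returned list is not empty. The pattern only looks at what is inside the attribute field, so the year that is part of a title is not affected as long as the title is split off first.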

ljdoncel,

Wow, that is probably the most useful and interesting thread on Get Satisfaction. You (and Dale79) have done a splendid job. 

"If you wish me to do so, I can expand the search to all the available crew positions on the FTP server and post a PDF file with most of the instances where a year (or year range) is listed in the attribute or character fields (excluding series main level, where some might be correct), so you (or any other contributor who wants to) can work on it."

If you'd be willing to create such a list, I'd be happy to work on it.

That said, I'm now working my way through category P from your Get Sat post. Do you have a link to Dale79's profile so I can contact him about this? (Edit: no need, found his profile)

Also, has the problem reported in category P "P) themselves IN CHARACTER NAME : this is a special case because, unlike others (such as Himself, himself, Herself, herself or Themselves) this all-lowercase word doesn't trigger the internal mechanisms to record the entry as a "self" credit." been reported to staffers yet? Otherwise it would be a good idea to start a new thread on this board about it, so they can fix it. (or at least put it on the list of things they want to fix )

Marco.

"You (and Dale79) have done a splendid job."

Thank you very much, Marco, but there's still a lot of work to be done. I enjoy adding new data or correcting existing info, but IMDb is a VERY LARGE DATABASE, so I believe that investing some time in collecting those inconsistencies and refining the way the data are stored is worthwhile to improve the integrity of the database.

"If you'd be willing to create such a list, I'd be happy to work on it."

Great, here it is! 😎 Please note that these listings have been generated from data contained in the latest available version of the plain text files actors.list, actresses.list, cinematographers.list, composers.list, costume-designers.list, directors.list, editors.list, miscellaneous.list, producers.list and writers.list, which are dated 13 May 2016, so some entries may have changed (thanks to dale79's and other contributors' work).

I've omitted from the PDF files the 174,907 credits at the entire series level because many may be correct (e.g. absence of specific episodes; perhaps I'll analyze these cases in more depth on another occasion to try to retrieve some wrong entries). I've also split into a separate file the acting credits that have the year or year range attached to the character field, given that many could be correct too.

*** Summary of Results ***

CREDITS WITH (year) OR (year range)
TOTAL: 275,693 credits in 132,006 names/103,202 titles

▶ CREDITS AT SERIES LEVEL: 174,907 credits in 124,368 names/32,151 titles
    ◽ (year) in CHARACTER field: 104,515 credits
    ◽ (year) in ATTRIBUTE field: 70,860 credits (yes, 468 in both fields)
        🔹 ACTORS: 1,022 credits
        🔹 CINEMATOGRAPHERS: 2,860 credits
        🔹 COMPOSERS: 2,673 credits
        🔹 COSTUME DESIGNERS: 789 credits
        🔹 DIRECTORS: 5,723 credits
        🔹 EDITORS: 5,225 credits
        🔹 MISCELLANEOUS: 18,865 credits
        🔹 PRODUCERS: 22,018 credits
        🔹 WRITERS: 11,685 credits

▶ EXCLUDING SERIES LEVEL: 100,786 credits in 11,480 names/71,051 titles
    ◽ (year) in CHARACTER field: 16,889 credits
    ◽ (year) in ATTRIBUTE field: 83,929 credits in 6,471 names/62,567 titles
        ⚫ Films: 2,388 credits in 918 names/1,339 titles
        ⚫ Made for video (V): 132 credits in 84 names/92 titles
        ⚫ Made for TV (TV): 1,306 credits in 618 names/892 titles
        ⚫ Videogames (VG): 30 credits in 22 names/26 titles
        ⚫ Episodes: 80,073 credits in 2,954 names/3,410 series/60,218 episodes
        🔹 ACTORS: 1,026 credits in 537 names/762 titles
        🔹 CINEMATOGRAPHERS: 1,194 credits in 207 names/1,129 titles
        🔹 COMPOSERS: 2,769 credits in 376 names/2,635 titles
        🔹 COSTUME DESIGNERS: 780 credits in 114 names/774 titles
        🔹 DIRECTORS: 2,250 credits in 275 names/2,141 titles
        🔹 EDITORS: 3,337 credits in 362 names/3,030 titles
        🔹 MISCELLANEOUS: 39,414 credits in 2,355 names/30,817 titles
        🔹 PRODUCERS: 27,298 credits in 1,772 names/23,135 titles
        🔹 WRITERS: 5,861 credits in 754 names/4,109 titles

The PDF files can be downloaded from my Dropbox account:

📄 NAMES: https://dl.dropboxusercontent.com/u/28703693/yearsinattribfield-names.pdf
Interesting fact: if someone had removed all the unneeded (years) of just the first 4 names of this list, that contributor would have been one of the 2015 Top 250 😱 (26 names to be top 100...)
📄 TITLES: https://dl.dropboxusercontent.com/u/28703693/yearsinattribfield-titles.pdf
📄 Supplementary material:
    ⚫ Detailed episode list: https://dl.dropboxusercontent.com/u/28703693/yearsinattribfield-episodes.pdf
    ⚫ (Years) in character field: https://dl.dropboxusercontent.com/u/28703693/yearsincharacterfield.pdf

"..has the problem reported in category P 'P) themselves IN CHARACTER NAME : this is a special case because, unlike others (such as Himself, himself, Herself, herself or Themselves) this all-lowercase word doesn't trigger the internal mechanisms to record the entry as a "self" credit.' been reported to staffers yet? Otherwise it would be a good idea to start a new thread on this board about it, so they can fix it."

I'm not sure about this, but "the boss" (Col Needham 😁😁) is a very active user in GetSat (appreciated and exemplary behaviour coming from a CEO of such a popular site, btw) and I'd bet that he reads every single post there, so I wouldn't be surprised if staffers were already aware of the problem. The ideal solution would be to amend the current instances in category P so they read Themselves instead of themselves, and to add an automatic correction of the all-lowercase words (himself, herself or themselves) at the beginning of a character name into the capitalized forms.
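To make the suggested automatic correction concrete, here is a rough Python sketch. It is only an illustration of the rule described above, not IMDb's internal mechanism; the function name normalize_self_credit is made up for the example, and the character name is assumed to arrive as a plain string.

```python
import re

# Matches an all-lowercase self word only at the very start of the
# character name and only as a whole word.
SELF_WORD = re.compile(r"^(himself|herself|themselves)\b")

def normalize_self_credit(character):
    """Capitalise a leading himself/herself/themselves; leave the rest as is."""
    return SELF_WORD.sub(lambda m: m.group(1).capitalize(), character)

if __name__ == "__main__":
    for name in ["themselves", "himself - Host", "Themselves", "The man himself"]:
        print(repr(name), "->", repr(normalize_self_credit(name)))
```

Names that already start with a capitalised form, or that contain the word anywhere other than the beginning, are returned unchanged.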

There is a ticket for us to take a look at the clean-up here (#0081501908)

Col

"there's still a lot of work to be done"

I agree, but judging from the thread at Get Satisfaction, it almost seems like most of it is already fixed. Maybe I'm being a bit of an optimist here.

"I believe that investing some time in collecting those inconsistencies and refining the way the data are stored is worthwhile to improve the integrity of the database."

I totally agree. I've said it before and I'll say it again: being a database contributor also means being a part-time janitor.

"Great, here it is!"

Thanks for creating the list! I've done a couple from the third page of the names list, but I had no idea the list would be so long. I am therefore very happy with Col's response upthread.

"I've also split into a separate file the acting credits that have the year or year range attached to the character field, given that many could be correct too."

I agree it's best to look at these on a case-by-case basis. I think the majority should probably be deleted, but there are some that should remain.

On a related note: For composers, I think there are quite a lot with a year range that should remain. I'm thinking of composers for silent films that have gotten a new score for example. So I'd strongly advise against bulk deletions in that section.

"but "the boss" (Col Needham 😁😁) is a very active user in GetSat (appreciated and exemplary behaviour coming from a CEO of such a popular site, btw) and I'd bet that he reads every single post there, so I wouldn't be surprised if staffers were already aware of the problem"

And luckily he also has his eyes on this board and has responded to this thread, so with a bit of luck, most issues mentioned here will become a thing of the past within a few months. (Of course there will always be errors that slip through, but their number should drop significantly.)

Marco.

"There is a ticket for us to take a look at the clean-up here (#0081501908)"

Thanks a bunch, Col. Such a clean-up would save all of us a lot of work.

Marco.

Any word on this yet?

Marco.

Hi Marco,

This one is still pending but we'll be in touch via the helpdesk once this has been actioned.

Regards,
Will
