Extract Portion of data from file name - VisualCron

Please note that VisualCron support is not actively monitoring this community forum. Please use our contact page for contacting the VisualCron support directly.

Welcome Guest! To enable all features please Login or Register.

Ujma
Paid support Topic Starter

2020-06-26T01:27:38Z

Hi ,

I want to Extract date from file name, like for example:

acs_xyzl_abb_cndsD_11_AAP_pqrsty_2020_06_10_1_2019_06_18_1523.xlsx from this I just want to extract "2020_06_10" from this file name, How can I do that ? can you please help me with this.

Thank You.

Sponsor

Forum information

Wanna join the discussion?! Login to your Forum forum account or Register a new forum account

thomas
Free support

2020-06-26T09:06:17Z

I guess the last part is of a fixed length, ie this part:

2020_06_10_1_2019_06_18_1523.xlsx

If that is the case, you can start by extracting this part (the 33 rightmost characters):

{STRING(Right|acs_xyzl_abb_cndsD_11_AAP_pqrsty_2020_06_10_1_2019_06_18_1523.xlsx|33)}

Then you extract the first 10 characters of this, and you end up with:

{STRING(Substring|{STRING(Right|acs_xyzl_abb_cndsD_11_AAP_pqrsty_2020_06_10_1_2019_06_18_1523.xlsx|33)}|0|10)}

Gary_W
Free support

2020-06-26T13:41:08Z

Here's a regex way. It will return the first set of 4 digits followed by an underscore followed by 2 digits followed by an underscore followed by 2 digits. It does not rely on character positioning, but on the fact that the date you want will be in this pattern and the first occurrence of that pattern in the string. I find when creating a regex you first need to be able to describe what you need to extract in plain language and make sure that description satisfies all possibilities you are dealing with. Including the unexpected! What will happen when the pattern is not found? Always expect the unexpected and code for it too. Since the regular expression language can get complicated make sure you understand it and document it in your code for future developers. Code for the person that will maintain this behind you. Don't make them spend 3 days figuring this out when you can include a comment to make it clear and have them coding in minutes. Have you been that person? Don't write expensive code (it cost an extra 3 days developer time!) and have the person behind you curse your name. lol Anyway I'm getting way off track.

{REGEX(Replace|acs_xyzl_abb_cndsD_11_AAP_pqrsty_2020_06_10_1_2019_06_18_1523.xlsx|.*?_(\d{4}_\d{2}_\d{2}).*|$1)}

The regex basically describes the string and defines a capture group around the part you want to keep. The captured group is returned by being referenced as $1.

Explained:
. - Match any character
* - Modifies the previous character. Zero or more of the previous character. So match zero or more of any characters.
? - Make that set of previous characters optional or non-greedy (stop after the first occurrence)
_ - when followed by an underscore
( - Start a group
\d{4}_\d{2}_\d{2} - match 4 digits, an underscore, 2 digits, an underscore and 2 digits)
) - close the group
.* - match the rest of the line. zero or more of any characters.
In the replace section:
$1 - Replace with the contents of the first captured group.

Expect the unexpected!
The entire string will be returned if the match is not found. Make sure that is handled. Also this pattern exists twice in the string. If the filename is malformed and only contains the last pattern, THAT will be returned. Is that ok? If not, the regex may need to be tweaked to ensure the occurrence of that pattern where it occurs at the end is never returned. Consider this:

{REGEX(Replace|acs_xyzl_abb_cndsD_11_AAP_pqrsty_2019_06_10_1_2019_06_18_1523.xlsx|.*?_(\d{4}_\d{2}_\d{2})_\d+_(\d{4}_\d{2}_\d{2}).*\.xlsx|$1)}

It will only return the first occurrence of the pattern, not the last one. (Note it's more complex then it has to be to meet this condition in order to work for the next examples). Remove a digit in the year of the first date. Put it back then change the $1 to $2. Make sure you know the specs of exactly what you need to return before building the regex as it can return the unexpected as well. Play around with the data a little and see what gets returned before using this in production.

Edited by user 2020-06-26T13:59:56Z | Reason: Not specified

Ujma
Paid support Topic Starter

2020-06-26T21:06:09Z

Hi Gary,

you are genius!, Regex one worked perfectly. Thank You So much, much appreciated.

Ha ha ! and yes I am a person who document everything 🙂

Ujma
Paid support Topic Starter

2020-06-26T21:09:31Z

another question:

is there any function available in VisualCron that can convert "Month Name" to it's corresponding number ??

Gary_W
Free support

2020-06-26T21:19:06Z

aw shucks, thanks for the kind words. I've just been around a while :-)

Your should start a new topic with the new question. I don't think there's something built in but that does not mean it can't be done.

Build your own:

{REGEX(Replace|January,1,February,2,March,3,April,4,May,5,June,6,July,7,August,8,September,9,October,10,November,11,December,12|.*?(December),(\d+),?.*|$2)}

The Month name is in a group to make it easier to define where you need to define your month name.

Edited by user 2020-06-26T21:33:42Z | Reason: Not specified

Ujma
Paid support Topic Starter

2020-06-29T17:57:31Z

thanks for your response, I will give it a try.

bweston
Paid support

2020-06-30T21:55:22Z

Originally Posted by: Gary_W

aw shucks, thanks for the kind words. I've just been around a while :-)

Your should start a new topic with the new question. I don't think there's something built in...

Sure there is! ...or it can be abused to be, rather.

{DATEADD(Years|June|MMMM|0|MM)}

Gary_W
Free support

2020-07-01T13:44:45Z

Well that's one for the bag of tricks!