Bug #203

source-infobel returns odd characters in CID

Added by tshif over 2 years ago. Updated over 2 years ago.

Status: Closed
Start: 10/15/2009
Priority: High
Due date: 10/26/2009
Assigned to: patrick_elx
% Done: 100%
Category: -
Target version: Caller ID Superfecta Source Files

Description

The infobel source returns oddball characters embedded within the names in some lookups.

View the results in web form here:

http://www.infobel.com/en/france/Inverse.aspx?q=France

The other view is from inside debug in superfecta.

The number I use is: 33145482390

In debug, the name is captured OK, even the accented e in Dupré André. But at the end of Dupré there is a very odd character (I think it's a form-feed character).

Here's what I saw:

badchars.JPG (4.2 KB) tshif, 10/15/2009 08:53 am


Related issues

related to Caller ID Superfecta - Bug #183: wrong charset do not display foreign characters Closed 10/02/2009 10/16/2009
related to Caller ID Superfecta - Bug #215: Choose the CLID charset Closed 10/26/2009 10/26/2009
duplicated by Caller ID Superfecta - Bug #208: Infobel odd character Closed 10/18/2009

History

Updated by tshif over 2 years ago

  • Status changed from New to Reviewed
  • Assigned to set to zorka

Updated by tshif over 2 years ago

  • Status changed from Reviewed to Assigned

Updated by tshif over 2 years ago

  • Target version changed from Caller ID Superfecta Source Files to Caller ID Superfecta - Future Versions

Updated by patrick_elx over 2 years ago

I've replaced the line

$sname = str_replace(chr(160), ' ', $sname[1][0]);

with

$sname = $sname[1][0];

It solves this problem, but I haven't been able to find out why it was there in the first place.
In the different countries I've tried I did not notice any regression.

Please test with other countries/numbers.

Updated in rev 185.
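One possible explanation for the stray character, offered only as a sketch: if the name is already UTF-8 when the replacement runs, a non-breaking space is stored as the two bytes C2 A0, and str_replace(chr(160), ...) swaps only the second byte. The sample string below is an assumption, not captured Infobel output.

```php
<?php
// Hypothetical sample: "Dupré André" in UTF-8 with a non-breaking space
// (U+00A0, bytes C2 A0) between the two names. This input is an assumption.
$name = "Dupr\xC3\xA9\xC2\xA0Andr\xC3\xA9";

// Byte-wise replacement: only the A0 byte is swapped, the C2 lead byte stays.
$bytewise = str_replace(chr(160), ' ', $name);

// Replacing the full two-byte sequence keeps the string valid UTF-8.
$whole = str_replace("\xC2\xA0", ' ', $name);

// preg_match() with the /u modifier returns 1 only for a valid UTF-8 subject.
$bytewiseValid = (preg_match('//u', $bytewise) === 1); // false
$wholeValid    = (preg_match('//u', $whole) === 1);    // true
```

The orphaned C2 byte left by the byte-wise version is the kind of thing a debug page or database viewer may render as an odd character.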

Updated by tshif over 2 years ago

Interesting. That's the á character.

Updated by tshif over 2 years ago

  • Target version changed from Caller ID Superfecta - Future Versions to Caller ID Superfecta Source Files

Updated by tshif over 2 years ago

The original contributor passed along to me that the original "$sname = str_replace(chr(160), ' ', $sname[1][0]);" was in the code because every infobel source returned the "Â" character between the first name and last name in every lookup he did.

Updated by tshif over 2 years ago

The original contributor informs (#208) that the original "$sname = str_replace(chr(160), ' ', $sname[1][0]);" was in the code because every infobel source returned the "Â" character between the first name and last name in every lookup he did.

Updated by tshif over 2 years ago

  • Assigned to deleted (zorka)
  • Priority changed from Normal to High

Duplicate entry deleted.

Updated by tshif over 2 years ago

All -
infobel cannot return to the live update repository until we have a handle on the spurious character issue, and on exactly what it might impact. I know we all want it as a source, but we need to get it right first.

Updated by tshif over 2 years ago

Updated by zorka - copied from #230 which is closed as a duplicate

After patrick changed it (removing chr(160)), it doesn't show up in debug for me, nor on any of the phones we use. It is however there in the database (using webmin or sqlyog to view it) and was giving trouble with a little application we developed internally to update our CRM from the cache. But as it turns out, that application is seldom used, and Superfecta's cache was designed for that.

But I agree, we could use some more testers reporting back whether they have issues with it or not.

Updated by tshif over 2 years ago

  • Status changed from Assigned to Feedback

Updated by patrick_elx over 2 years ago

This page, http://www.voip-info.org/wiki/view/Asterisk+func+callerid,
shows that we need to limit our caller ID to the ASCII or IA5 character set.

I will try to work on a filter to make sure that is the only character set allowed.
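A minimal sketch of what such a filter could look like, assuming the name arrives as UTF-8. The function name cid_to_ascii and the small replacement table are hypothetical, not Superfecta code; a real filter would need a much fuller table.

```php
<?php
// Hypothetical helper, not part of Superfecta: reduce a UTF-8 name to
// printable ASCII (0x20-0x7E) so only safe characters reach the phone.
function cid_to_ascii($utf8Name)
{
    // Small illustrative map (UTF-8 byte sequences => ASCII).
    $map = [
        "\xC2\xA0" => ' ',  // non-breaking space
        "\xC3\xA0" => 'a',  // à
        "\xC3\xA1" => 'a',  // á
        "\xC3\xA2" => 'a',  // â
        "\xC3\xA7" => 'c',  // ç
        "\xC3\xA8" => 'e',  // è
        "\xC3\xA9" => 'e',  // é
        "\xC3\xAA" => 'e',  // ê
        "\xC3\xB4" => 'o',  // ô
        "\xC3\xB9" => 'u',  // ù
    ];
    $ascii = strtr($utf8Name, $map);

    // Drop every remaining byte outside printable ASCII, including control
    // characters such as form feed (0x0C).
    $ascii = preg_replace('/[^\x20-\x7E]/', '', $ascii);

    return trim($ascii);
}

// Example: accented letters are mapped, the form feed is dropped.
$clean = cid_to_ascii("Dupr\xC3\xA9\x0C Andr\xC3\xA9"); // "Dupre Andre"
```

Characters missing from the table are simply removed, which is lossy but never sends a byte outside the allowed range.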

Updated by tshif over 2 years ago

That looks like exactly the issue we are dealing with. Awesome research! It will be fascinating to see your results :-)

Updated by patrick_elx over 2 years ago

I've spent hours on this issue and I'm still stuck.
Here's what I think I've figured out so far.

- The data curl fetches from the source website can be in different charsets. We need to convert it to UTF-8 for handling purposes.
Right now we force a utf8_encode in the get_url function. I tried some more intelligent functions that check the charset before encoding (either a simple check for whether the data is already UTF-8, or a more specific one that treats each charset differently). I'm not satisfied with what I have right now. I'm still working on it, but I could use help. However, this part does not seem too critical right now.

- We need to split the printing of the returned caller_id, as the one we want for debug should be in UTF-8 and the one for Asterisk should be in ASCII (we just need to add an else in the "if not debug" to do the utf8_decode).

- What's driving me crazy with Infobel is that doing the utf8_decode before sending back to Asterisk solves the various problems we had with this source (the chr(160) and other weird issues), but it adds one or two lines (or special characters?) in front of the CID that you can see in the CLI. I don't know what these characters in front are, or how to get rid of them.

Maybe someone can look into a really good encoding routine from any charset to UTF-8, and a good decoding routine from UTF-8 to ASCII that replaces forbidden characters.
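As a starting point only, here is a sketch of the two steps discussed in this comment: encode to UTF-8 only when the data is not already valid UTF-8, then pick the output encoding depending on debug mode. The function names to_utf8 and cid_for_output are hypothetical, and treating every non-UTF-8 input as ISO-8859-1 is an assumption (it is the same one utf8_encode makes). Note that utf8_decode, like the iconv call here, yields ISO-8859-1 rather than strict ASCII.

```php
<?php
// Hypothetical helper: convert fetched data to UTF-8 only when needed.
function to_utf8($raw)
{
    // preg_match() with /u returns 1 only if $raw is already valid UTF-8.
    if (preg_match('//u', $raw) === 1) {
        return $raw;
    }
    // Assumption: anything else is ISO-8859-1.
    return iconv('ISO-8859-1', 'UTF-8', $raw);
}

// Hypothetical helper: UTF-8 for the debug page, single-byte for Asterisk.
function cid_for_output($utf8Name, $debug)
{
    if ($debug) {
        return $utf8Name;
    }
    // Same target charset as utf8_decode(); @ silences the notice iconv
    // raises when a character has no ISO-8859-1 equivalent.
    $latin1 = @iconv('UTF-8', 'ISO-8859-1', $utf8Name);
    if ($latin1 === false) {
        // Fallback: keep only printable ASCII bytes.
        return preg_replace('/[^\x20-\x7E]/', '', $utf8Name);
    }
    return $latin1;
}

$fromLatin1 = to_utf8("Dupr\xE9");      // Latin-1 input is converted
$fromUtf8   = to_utf8("Dupr\xC3\xA9");  // UTF-8 input is left alone
$forPhone   = cid_for_output($fromUtf8, false);
```

Checking before encoding avoids double-encoding a page that is already UTF-8, which a forced utf8_encode would do.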

I would also suggest that we add a checkbox option in the Superfecta general config to restrict caller ID to the ITU charset for legacy phones (limited ASCII without any accented characters). It should not be needed for SIP phones, as they all accept extended ASCII, but if we do some work in this area, adding this option should not be difficult and could help users using FXS with CID.

Updated by zorka over 2 years ago

Are you sure there aren't any empty lines outside the PHP tags in the source? Or that the file is in the correct format (DOS, Unix)? Those two situations caused empty lines in front of the CID in the CLI for me before.
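One way to check for this, sketched under the assumption that the source file can be included on its own: capture whatever the file prints while loading. The helper name stray_output_from and the throwaway demo file are hypothetical.

```php
<?php
// Hypothetical helper: include a PHP file and return anything it printed.
function stray_output_from($file)
{
    ob_start();
    include $file;
    return ob_get_clean();
}

// Demo with a throwaway file that has two blank lines before the opening tag.
$tmp = tempnam(sys_get_temp_dir(), 'src');
file_put_contents($tmp, "\n\n<?php \$dummy = 1;\n");
$stray = stray_output_from($tmp);
unlink($tmp);

// $stray now holds the two newlines that would end up in front of the CID.
```

A non-empty result for a source file that is supposed to print nothing on load points to text outside the PHP tags.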

Updated by patrick_elx over 2 years ago

Yes, that was it... Thanks.
I couldn't see them when doing a direct HTTP lookup.

How can such a stupid thing make me lose my mind... :-(

Good, let's go back to debugging the more important part.

Updated by patrick_elx over 2 years ago

  • % Done changed from 0 to 20

Uploaded SVN rev 201 (caller_id and infobel).

Please do some tests and report back on whether this improves things or not.
I've also noticed that my Cisco 7960s are not displaying some extended characters. It's not blocking, since the phone does ring, but it replaces the CID with a blank line. I will try to add some more restrictive filters later on.

Updated by patrick_elx over 2 years ago

Check bug #215 and rev 203, which should solve this issue.

Updated by tshif over 2 years ago

  • Status changed from Feedback to QA Testing
  • Assigned to set to tshif
  • % Done changed from 20 to 90

QS: Pass.

Infobel no longer returns the oddball character in debug. (Nice work Patrick!)
No disturbances found in the CID presented to telephones. (This is not authoritative: I never experienced this error, so I can't test for it satisfactorily.)

Updated by tshif over 2 years ago

  • % Done changed from 90 to 100

QS: Passed. Approved for inclusion in build 2.2.2, and posting to live update repo.

Updated by tshif over 2 years ago

  • Assigned to changed from tshif to patrick_elx

Patrick - is the current version of infobel compatible with superfecta 2.2.1 and lower? Were changes introduced in this source which require 2.2.2 and greater?

Updated by tshif over 2 years ago

  • Due date set to 10/26/2009
  • Status changed from QA Testing to Closed
