GuArdor
New Contributor

HE Peering Issue

I have been in contact with some admins from NFO servers recently as well as a few months back in October.

I sent them a few trace-routes from my IP address to one of the servers they host for Battlefield 4 because of some unusually high latency I was getting. Long story short, the issue came back recently, and they referred to it as an HE peering issue that they are attempting to route around so that I can still get reasonable latency. For the most part they have been successful, but they wanted me to inform Cox of these issues.

I'm not exactly sure what information will be beneficial for the Cox technicians so I am posting here so that they can tell me who to inform and what information they need.

Here is a summary of one of the responses the NFO admins gave me in an email:

"Previously, Cox was overloading its connection to Level3 in Phoenix, so I had to force our system to avoid using Level3 to reach Cox. That rule is still in place.

Tonight, Cox was overloading its connections to HE, Tata, and some other providers in Chicago, and that was causing high latencies."

Hopefully that makes sense to somebody and I can post more information if necessary. I still have some old trace-route logs from when the problems were occurring that I can post as well.

9 Replies

  • Tecknowhelp
    Valued Contributor II

    First, it may be helpful to include the relevant data. I am curious how they can tell it's not congestion.

    Second, as well-intentioned as they are, it's unlikely a moderator would be able to escalate the data to anyone with enough pull to change routing rules on a national level.

    Last, this page shows latency and packet loss between Level3 and SBC as well as Savvis, AKA CenturyLink, so I think other providers are having issues with Level3 too. On top of that, this page has a lot of people posting issues with Battlefield servers, from different states and different ISPs. This feels like an EA issue.

  • GuArdor
    New Contributor

    I have included an example below of my most recent trace-route during the period I was experiencing high latency. I have at least 1 or 2 older trace-routes I can add as well if needed.

    As far as the moderators go, I was using that term to refer to several people who responded to me. The problem was actually escalated to a founder/president of the company, not just basic server admins.

    I'm not going to pretend I know what's going on with the routing and the problem with Level3, but that seems like the center of the issue. I don't know if Cox has control of that or can do anything about it, but perhaps you could enlighten me. Based on what you said, it sounds like it's in everyone's interest to figure out what's going on with this "Level3" thing, whatever it is.

    If more information is required the best I can do is try to get you in contact with the person who was helping me and you guys can try to sort this out.

    Tracing route to c-216-52-143-163.managed-ded.premium-chicago.nfoservers.com [216.52.143.163]
    over a maximum of 30 hops:

      1    <1 ms    <1 ms    <1 ms  192.168.0.1
      2     7 ms     8 ms     7 ms  10.32.0.1
      3     9 ms     7 ms     8 ms  100.127.68.170
      4    12 ms     7 ms     7 ms  70.169.74.146
      5    41 ms    22 ms    19 ms  langbprj01-ae1.rd.la.cox.net [68.1.1.13]
      6    21 ms    24 ms    21 ms  be-204-pe02.600wseventh.ca.ibone.comcast.net [173.167.57.109]
      7    22 ms    22 ms    30 ms  hu-0-5-0-3-cr02.losangeles.ca.ibone.comcast.net [68.86.83.29]
      8     *        *        *     Request timed out.
      9    55 ms    49 ms    51 ms  be-10919-cr01.1601milehigh.co.ibone.comcast.net [68.86.85.154]
     10    60 ms    51 ms    51 ms  be-11719-cr02.denver.co.ibone.comcast.net [68.86.86.77]
     11    68 ms    65 ms    65 ms  be-10517-cr02.350ecermak.il.ibone.comcast.net [68.86.85.169]
     12    65 ms    65 ms    64 ms  be-10563-pe01.350ecermak.il.ibone.comcast.net [68.86.82.158]
     13    65 ms    64 ms    67 ms  96-87-8-110-static.hfc.comcastbusiness.net [96.87.8.110]
     14    69 ms    64 ms    66 ms  border6.po2-bbnet2.chg.pnap.net [64.94.32.75]
     15    74 ms    77 ms    67 ms  inap-b6.e2.router1.chicago.nfoservers.com [216.52.148.254]
     16   157 ms   155 ms     *     c-216-52-143-163.managed-ded.premium-chicago.nfoservers.com [216.52.143.163]
     17   153 ms   153 ms   153 ms  c-216-52-143-163.managed-ded.premium-chicago.nfoservers.com [216.52.143.163]


    Trace complete.

  • Tecknowhelp
    Valued Contributor II

    Where did they say the Cox problem was? I don't see Level3 or any of the other mentioned groups in the tracert. Also, most of the latency occurs between hop 15 and 16, which is on NFO's network, the people who host the servers. Can you show the other tracert?

    Also, as per here, NFO says:

    "Our Premium locations are based on a strong foundation of Internap bandwidth, but we use additional peering and transit as well as our own optimizations to further enhance it and bring latencies down to the lowest that we can make them for our customers. Premium locations should offer the best possible latencies in their respective cities -- not just lower than Internap alone, but lower than anyone else out there."

    It looks like they have a Metro-E connection between a Cox Business account and a Comcast Business account and they are using that to route traffic. I wonder if this is part of their "own optimizations" when their peers get congested.

    It's complex, but there are two kinds of bandwidth: the bandwidth your ISP pays for to put your traffic onto the internet, and the bandwidth the content provider (NFO) pays for to send its traffic back to you. In between the two, there are usually bottlenecks on the public internet. Your ISP can control how traffic gets routed inside its own network, where it hands off to the internet, and problems between the two, but once traffic hits a network they can't control, all they can do is complain to those who do control it. However, complaining to a provider about routing problems is like calling the fire station to report that the fire station is on fire... they probably already know.

    Last, it looks like this has been happening for a while. I see reports going back to 2014 showing issues specifically getting to Chicago servers. See forum post here. Here is one from 2016 that goes into more depth about the problem.
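
The observation earlier in this reply, that most of the added latency shows up between hops 15 and 16, can be checked mechanically. Below is a minimal Python sketch, assuming the Windows `tracert` layout pasted in this thread. The names `parse_tracert`, `largest_jump`, and `SAMPLE` are made up for illustration and are not part of any tool; `SAMPLE` is the last four hops of the trace posted above.

```python
import re

# One Windows tracert row: hop number, three probes ("12 ms", "<1 ms" or "*"), then the host.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+\s+ms|\*)\s+){3})(.+?)\s*$")
RTT_RE = re.compile(r"<?(\d+)\s+ms")


def parse_tracert(text):
    """Return (hop, best_rtt_ms or None, host) for every hop row in the text."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header, blank line, "Trace complete.", etc.
        # "<1 ms" is counted as 1 ms, which is close enough for spotting large jumps.
        rtts = [int(v) for v in RTT_RE.findall(match.group(2))]
        best = min(rtts) if rtts else None  # None when all three probes timed out
        hops.append((int(match.group(1)), best, match.group(3)))
    return hops


def largest_jump(hops):
    """Return (from_hop, to_hop, delta_ms) for the biggest rise in best RTT."""
    answered = [h for h in hops if h[1] is not None]
    jumps = [(a[0], b[0], b[1] - a[1]) for a, b in zip(answered, answered[1:])]
    return max(jumps, key=lambda j: j[2]) if jumps else None


SAMPLE = """
 13    65 ms    64 ms    67 ms  96-87-8-110-static.hfc.comcastbusiness.net [96.87.8.110]
 14    69 ms    64 ms    66 ms  border6.po2-bbnet2.chg.pnap.net [64.94.32.75]
 15    74 ms    77 ms    67 ms  inap-b6.e2.router1.chicago.nfoservers.com [216.52.148.254]
 16   157 ms   155 ms     *     c-216-52-143-163.managed-ded.premium-chicago.nfoservers.com [216.52.143.163]
"""

if __name__ == "__main__":
    print(largest_jump(parse_tracert(SAMPLE)))  # (15, 16, 88)
```

This only shows where the round-trip time rises. Each hop's reply travels back along its own return route, so a jump at the final hop can also come from the path the server uses to send traffic back, not only from the forward path shown in the trace.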

  • GuArdor
    New Contributor
    [*I had to make several edits to this post due to formatting issues, please refresh your window for latest response.]

    Below I have included the older trace routes as well as a couple relevant email responses I received from the NFO team.

    If you can dissect it and determine a course of action then that would be amazing. I've made a few notes in [*Brackets like this].

    As a side note to everything so far, I have recently experienced serious packet loss while playing on these servers, so much so
    that it has kicked me from the game about 3 times now. All the while my internet connection remains intact, but only well enough
    to browse websites.

    \\\\\\\\\\\\\\\\

    [*Early e-mail response in regards to the trace-routes below.]

    It's strange that your ISP is using Comcast to reach us. I haven't seen an ISP besides Comcast do that before, and we
    prepend Comcast to discourage it.

    But, that doesn't seem to be the real problem. The real problem seems to be on the outbound, in that your ISP is
    overloading its connection to Level3 in or around Phoenix, and they are advertising their prefixes in an unusual way
    that is preventing our route optimizer from being able to correctly see that the Telia->Level3 path is poor.

    I've made a manual adjustment now to our system and it is re-optimizing the path to you. You should see an
    improvement within just a few minutes.

    [*Note: I replied to them that the fix worked and they made a mention that they hope Cox upgrades its peering link.]

    \\\\\\\\\\\\\\\\

    [*A few months later I posted the newer trace-route and they had this to say]

    This one isn't the same problem as before, actually. It's another problem with Cox, but it's a different one.

    Previously, Cox was overloading its connection to Level3 in Phoenix, so I had to force our system to avoid using
    Level3 to reach Cox. That rule is still in place.

    Tonight, Cox was overloading its connections to HE, Tata, and some other providers in Chicago, and that was
    causing high latencies.

    This is a new problem that fundamentally isn't on our end, but I'm going to see what I can do to try to work around it.
    I'll have to get more data and do further testing while the problem is occurring. I also encourage you to reach out to
    Cox for an ETR on a fix.

    [*In their latest response to me they mentioned that this issue had something to do with an HE peering link.]

    \\\\\\\\\\\\\\\\

    Tracing route to c-216-52-148-245.managed-ded.premium-chicago.nfoservers.com [216.52.148.245]
    over a maximum of 30 hops:

      1    <1 ms    <1 ms    <1 ms  192.168.0.1
      2     8 ms     7 ms    13 ms  10.32.0.1
      3    26 ms    16 ms     8 ms  100.127.68.168
      4     8 ms    12 ms     9 ms  70.169.74.52
      5    24 ms    19 ms    40 ms  langbprj01-ae1.rd.la.cox.net [68.1.1.13]
      6    21 ms    21 ms    22 ms  be-204-pe02.600wseventh.ca.ibone.comcast.net [173.167.57.109]
      7    23 ms    24 ms    24 ms  hu-0-5-0-1-cr02.losangeles.ca.ibone.comcast.net [68.86.83.21]
      8     *     2672 ms  2704 ms  be-10915-cr01.sunnyvale.ca.ibone.comcast.net [68.86.86.97]
      9    50 ms    50 ms    52 ms  be-10919-cr01.1601milehigh.co.ibone.comcast.net [68.86.85.154]
     10    51 ms    63 ms    53 ms  be-11719-cr02.denver.co.ibone.comcast.net [68.86.86.77]
     11    67 ms    68 ms    69 ms  be-10517-cr02.350ecermak.il.ibone.comcast.net [68.86.85.169]
     12    65 ms    66 ms    67 ms  hu-0-13-0-4-pe01.350ecermak.il.ibone.comcast.net [68.86.85.2]
     13    67 ms    65 ms    67 ms  96-87-8-110-static.hfc.comcastbusiness.net [96.87.8.110]
     14    70 ms    66 ms    65 ms  border5.po2-bbnet2.chg.pnap.net [64.94.32.74]
     15    65 ms    65 ms    65 ms  inap-b5.e3.router1.chicago.nfoservers.com [216.52.143.254]
     16   150 ms   152 ms   154 ms  c-216-52-148-245.managed-ded.premium-chicago.nfoservers.com [216.52.148.245]

    Trace complete.

    \\\\\\\\\\\\\\\\

    Tracing route to c-216-52-143-163.managed-ded.premium-chicago.nfoservers.com [216.52.143.163]
    over a maximum of 30 hops:

      1    <1 ms    <1 ms    <1 ms  192.168.0.1
      2     7 ms     8 ms     7 ms  10.32.0.1
      3     9 ms     9 ms     7 ms  100.127.68.168
      4    11 ms    10 ms     7 ms  70.169.74.52
      5    23 ms    20 ms    21 ms  langbprj01-ae1.rd.la.cox.net [68.1.1.13]
      6    22 ms    23 ms    24 ms  be-204-pe02.600wseventh.ca.ibone.comcast.net [173.167.57.109]
      7    24 ms    23 ms    22 ms  hu-1-0-0-0-cr02.losangeles.ca.ibone.comcast.net [68.86.88.9]
      8    37 ms    37 ms    36 ms  be-10915-cr01.sunnyvale.ca.ibone.comcast.net [68.86.86.97]
      9    52 ms    52 ms    50 ms  be-10919-cr01.1601milehigh.co.ibone.comcast.net [68.86.85.154]
     10    52 ms    55 ms    52 ms  be-11719-cr02.denver.co.ibone.comcast.net [68.86.86.77]
     11    66 ms    68 ms    68 ms  be-10517-cr02.350ecermak.il.ibone.comcast.net [68.86.85.169]
     12    64 ms    68 ms    65 ms  hu-0-14-0-6-pe01.350ecermak.il.ibone.comcast.net [68.86.86.122]
     13    68 ms    66 ms    66 ms  96-87-8-110-static.hfc.comcastbusiness.net [96.87.8.110]
     14    75 ms    66 ms    66 ms  border10.po1-bbnet1.chg.pnap.net [64.94.32.22]
     15    65 ms    68 ms    66 ms  inap-b10.e5.router1.chicago.nfoservers.com [64.74.97.254]
     16   156 ms   157 ms   159 ms  c-216-52-143-163.managed-ded.premium-chicago.nfoservers.com [216.52.143.163]

    Trace complete.
  • Tecknowhelp
    Valued Contributor II

    Haven't had time to read the entire post yet, but what concerns me most is when they say "The real problem seems to be on the outbound." What outbound do they mean? Outbound from them? If so, that route is decided by them. It seems like the engineer is seeing congestion between them and the different routes they use for outbound and blaming that on Cox. Why isn't he commenting about the 100ms latency that is happening in their network?

    Also, they say there was latency between Cox and Level3, but none of the tracerts show Level3 peering. Can you show the tracert that the engineer was referring to when he said that?

    Last, I think it would help to get different people's opinions on this so it's not just me (as Cox) pointing the finger at NFO and NFO pointing the finger at Cox. Try posting a thread on DSLReports and link to this thread. There are engineers there, both officially registered Cox ones and anonymous floaters, so I think they will be able to comment and possibly fix the issue if there IS an issue on Cox's side. I'm not saying an issue on Cox's end isn't possible, I just haven't seen the data for it yet.

  • Tecknowhelp
    Valued Contributor II

    Here is a tracert I pulled from this thread on DSLReports that DOES show a possible peering issue between Cox and Level3. This was 3 years ago, so it's not relevant to your problem, but it shows the kind of tracert that would match what NFO is talking about. However, it looks nothing like your tracerts.

      1     *        *        *     Request timed out.
      2     8 ms     7 ms     8 ms  172.21.1.52
      3    11 ms    20 ms     7 ms  70.169.75.248
      4    17 ms    19 ms    28 ms  72.215.229.61
      5     9 ms     9 ms     9 ms  ae56.bar1.Phoenix1.Level3.net [4.31.188.61]
      6    51 ms    52 ms    83 ms  ae-2-3514.edge2.Atlanta4.Level3.net [4.69.150.165]
      7    52 ms    52 ms    52 ms  gtt-level3.Atlanta4.level3.net [4.68.63.158]
      8    63 ms    63 ms    64 ms  xe-4-3-0.lax21.ip4.gtt.net [89.149.182.166]
      9    64 ms    63 ms    74 ms  internap-gw.ip4.gtt.net [77.67.70.58]
     10    67 ms    65 ms    67 ms  border2.po1-20g-bbnet1.lax014.pnap.net [216.52.255.52]
     11    65 ms    65 ms    66 ms  c-66-150-155-210.internap-la.nfoservers.com [66.150.155.210]

    Ask NFO if they have a mirror portal, or some way of seeing how the traffic gets from them to Cox. The engineer is probably looking at that.

  • Tecknowhelp
    Valued Contributor II

    GuArdor said:
    they are advertising their prefixes in an unusual way
    that is preventing our route optimizer from being able to correctly see that the Telia->Level3 path is poor.

    I think this is the most important part. NFO uses a system to dynamically route traffic to the ISP with the lowest latency. However, low latency doesn't mean robust bandwidth. IMO NFO needs to communicate with Cox about this issue. Getting a Cox engineer to look at the issue may help, but the request would be better coming directly from NFO's engineering team.

    GuArdor said:
    I haven't seen an ISP besides Comcast do that before, and we
    prepend Comcast to discourage it.

    NFO needs to prepend Cox to change the way they update route prefixes. I am confused why he would admit that these conversations happen between NFO and Comcast but then ask the user of an NFO customer to try to get Cox to contact NFO. It seems inefficient.

    GuArdor said:
    but I'm going to see what I can do to try to work around it.

    It seems he does escalate the ticket, which I hope leads to that conversation. If not, DSLReports is your next best bet. This forum has 10-15 other dedicated users, but I don't think any of them are Cox engineers. If there are, PLEASE POST. You can also email Cox at cox.help@cox.com. On DSLR, Chris and Jason are two registered Cox employees who I think work in the AZ SOC.
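
For readers unfamiliar with the "we prepend Comcast to discourage it" remark quoted in this reply: AS-path prepending means a network repeats its own AS number in the routes it announces to one neighbor, so the path through that neighbor looks longer to everyone else. The Python sketch below is a deliberately simplified model that compares only AS-path length. All AS numbers are from the range reserved for documentation, and the names `announce`, `received_via`, and `pick_route` are hypothetical; nothing here reflects NFO's or Cox's actual configuration.

```python
HOST_ASN = 64496    # hypothetical origin network (documentation-only ASN)
NEIGHBOR_A = 64497  # hypothetical upstream the origin would like traffic to arrive through
NEIGHBOR_B = 64498  # hypothetical upstream the origin wants to discourage
MIDDLE_ASN = 64499  # hypothetical extra network on the path through NEIGHBOR_A


def announce(origin_asn, extra_prepends=0):
    """AS path as the origin announces it, with its own ASN repeated if prepending."""
    return [origin_asn] * (1 + extra_prepends)


def received_via(neighbor_asn, path):
    """Each network that passes the route along adds its own ASN at the front."""
    return [neighbor_asn] + path


def pick_route(candidates):
    """Simplified choice: the candidate with the shortest AS path wins."""
    return min(candidates, key=lambda name: len(candidates[name]))


# No prepending: the path through NEIGHBOR_B is shorter, so it is chosen.
plain = {
    "via_a": received_via(NEIGHBOR_A, received_via(MIDDLE_ASN, announce(HOST_ASN))),
    "via_b": received_via(NEIGHBOR_B, announce(HOST_ASN)),
}

# The origin prepends twice toward NEIGHBOR_B only, making that path look longer.
prepended = dict(plain, via_b=received_via(NEIGHBOR_B, announce(HOST_ASN, extra_prepends=2)))

if __name__ == "__main__":
    print(pick_route(plain))      # via_b
    print(pick_route(prepended))  # via_a
```

In real BGP, path length is only one step in route selection, and the receiving network's own local preference is evaluated before it. That would be one way an ISP could keep sending traffic through a neighbor despite the prepending.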

  • Tecknowhelp
    Valued Contributor II

    I tried to bump the thread, but I am IP blocked from my DSLR account... for the 5th time. It's almost like Cox doesn't like me on there. XD